November 20, 2023
Memory Usage
def memory():
    """Return total, free, and used memory in kB, parsed from /proc/meminfo."""
    with open('/proc/meminfo', 'r') as mem:
        ret = {}
        tmp = 0
        for i in mem:
            sline = i.split()
            if sline[0] == 'MemTotal:':
                ret['total'] = int(sline[1])
            elif sline[0] in ('MemFree:', 'Buffers:', 'Cached:'):
                tmp += int(sline[1])
        # Free memory is the sum of MemFree, Buffers, and Cached
        ret['free'] = tmp
        ret['used'] = ret['total'] - ret['free']
    return ret
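For reference, here is a self-contained sketch of the same parsing applied to an in-memory snippet in the /proc/meminfo format, so it can be tried off a Linux box too. The sample values are made up for illustration:

```python
# Sample text in the /proc/meminfo format (values are illustrative only)
SAMPLE = """MemTotal:       16384000 kB
MemFree:         4096000 kB
Buffers:          512000 kB
Cached:          2048000 kB
"""

def parse_meminfo(text):
    """Parse meminfo-style text into total/free/used memory in kB."""
    ret = {}
    free = 0
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == 'MemTotal:':
            ret['total'] = int(parts[1])
        elif parts[0] in ('MemFree:', 'Buffers:', 'Cached:'):
            free += int(parts[1])
    ret['free'] = free
    ret['used'] = ret['total'] - ret['free']
    return ret

print(parse_meminfo(SAMPLE))
# → {'total': 16384000, 'free': 6656000, 'used': 9728000}
```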
No Hang Up
nohup jupyter notebook --no-browser > notebook.log 2>&1 &
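Breaking the command down: nohup keeps the process alive after the terminal hangs up, > notebook.log 2>&1 sends both stdout and stderr to the log, and the trailing & backgrounds the job. A minimal sketch of the same pattern, with sleep standing in for jupyter so it runs anywhere, plus a way to stop the job later:

```shell
# Start a long-running command detached from the terminal;
# sleep stands in for jupyter notebook here.
nohup sleep 30 > notebook.log 2>&1 &
echo $! > notebook.pid   # $! is the PID of the last background job

# Later: stop the background process and clean up
kill "$(cat notebook.pid)"
rm notebook.pid notebook.log
```

Saving `$!` to a pid file is handy because once the terminal is gone, you would otherwise have to hunt for the process with `ps`.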
Workaround: no cell output
import time

se = time.time()
print(train.rdd.getNumPartitions())
print(test.rdd.getNumPartitions())
e = time.time()
print("Training time = {}".format(e - se))

your_float_variable = (e - se)
comment = "Training time for getNumPartitions:"
# Open the file in append mode and write the comment and variable
with open('output.txt', 'a') as f:
    f.write(f"{comment} {your_float_variable}\n")
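As an aside, time.perf_counter() is a steadier clock for measuring intervals than time.time(), which can jump if the system clock is adjusted. A minimal sketch of the same log-to-file pattern, with a summing loop standing in for the Spark calls above:

```python
import time

start = time.perf_counter()
total = sum(range(1_000_000))      # stand-in workload for the Spark calls
elapsed = time.perf_counter() - start

# Append the timing to a log file, as in the workaround above
with open('output.txt', 'a') as f:
    f.write(f"Stand-in workload time: {elapsed}\n")
print(f"Stand-in workload time: {elapsed}")
```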
October 24, 2023
pip uninstall plotly
jupyter labextension uninstall @jupyterlab/plotly-extension
jupyter labextension uninstall jupyterlab-plotly
jupyter labextension uninstall plotlywidget
jupyter labextension update --all
pip install plotly==5.17.0
pip install "jupyterlab>=3" "ipywidgets>=7.6"
pip install jupyter-dash
jupyter labextension list
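After reinstalling, one way to confirm the pinned version actually landed is to query package metadata without importing the package itself. A small sketch (the helper name is mine):

```python
from importlib import metadata

def installed_version(pkg):
    """Return the installed version string of pkg, or None if not installed."""
    try:
        return metadata.version(pkg)
    except metadata.PackageNotFoundError:
        return None

# After the steps above, installed_version("plotly") should report "5.17.0"
print(installed_version("plotly"))
```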
Useful Links
June 8, 2023
Thanks to the Jupyter community, it’s now much easier to run PySpark on Jupyter using Docker.
There are two ways to do this: the “direct” way and the customized way.
The “direct” way