import json

with open('data-dev/jsondict-test.json', 'r') as fp:
    d = json.load(fp)
d
d.items()
dict_items([('b', 2), ('c', 3), ('d', 4)])
core
List of fastcore objects to load for use once eccore is imported

from basics:
- gen, chunked
- NS, SimpleNameSpace
- store_attr
- getattrs, hasattrs, setattrs
- listify, tuplify, setify, uniqueify, stringify…
- filter_x, argwhere, val2idx
- patch_to
- PrettyString
- ipython_shell, in_ipython, in_colab, in_jupyter, in_notebook

from foundation:
- L
- Config

from Utility functions:
- untar_dir
- run
- Path with added capabilities

from Meta:
- delegates, use_kwargs, funcs_kwargs, method

Classes to handle data structure more easily
JsonDict (p2json:str|pathlib.Path, dictionary:Optional[dict]=None)
*Dictionary whose current value is mirrored in a JSON file and can be initiated from a JSON file.
JsonDict requires a path to a JSON file at creation. An optional dict can be passed as argument.
Behavior at creation:
- JsonDict(p2json, dict) creates a JsonDict with the key-value pairs from dict, mirrored in p2json
- JsonDict(p2json) creates a JsonDict with an empty dictionary and loads the JSON content if the file exists

Once created, JsonDict instances behave exactly as a dictionary*
| | Type | Default | Details |
|---|---|---|---|
| p2json | str \| pathlib.Path | | path to the json file to mirror with the dictionary |
| dictionary | Optional | None | optional dictionary to initialize the JsonDict |
Create a new dictionary mirrored to a JSON file:
d = {'a': 1, 'b': 2, 'c': 3}
p2json = Path('data-dev/jsondict-test.json')
jsond = JsonDict(p2json, d)
jsond
{'a': 1, 'b': 2, 'c': 3}
dict mirrored in /home/vtec/projects/ec-packages/eccore/nbs-dev/data-dev/jsondict-test.json
Once created, the JsonDict instance behaves exactly like a dictionary, with the added benefit that any change to the dictionary is automatically saved to the JSON file.
key: a; value: 1
key: b; value: 2
key: c; value: 3
Adding or removing a value works the same way as for a normal dictionary, but the JSON file is automatically updated.
{'a': 1, 'b': 2, 'c': 3, 'd': 4}
dict mirrored in /home/vtec/projects/ec-packages/eccore/nbs-dev/data-dev/jsondict-test.json
{'b': 2, 'c': 3, 'd': 4}
dict mirrored in /home/vtec/projects/ec-packages/eccore/nbs-dev/data-dev/jsondict-test.json
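The mirroring behavior described above can be illustrated with a small stand-in class. This is only a sketch under assumptions, not eccore's implementation: `MirroredDict` is a hypothetical name, and rewriting the whole file after every change is one simple strategy among several.

```python
import json
import tempfile
from collections import UserDict
from pathlib import Path


class MirroredDict(UserDict):
    """Hypothetical stand-in for JsonDict: rewrites a JSON file after every change."""

    def __init__(self, p2json, dictionary=None):
        self.p2json = Path(p2json)       # must be set before any item is written
        super().__init__()
        if dictionary is not None:
            self.update(dictionary)      # goes through __setitem__, so the file is written
        elif self.p2json.is_file():
            self.data = json.loads(self.p2json.read_text())

    def _save(self):
        self.p2json.write_text(json.dumps(self.data))

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._save()

    def __delitem__(self, key):
        super().__delitem__(key)
        self._save()


with tempfile.TemporaryDirectory() as tmp:
    p2json = Path(tmp) / "jsondict-test.json"
    jsond = MirroredDict(p2json, {"a": 1, "b": 2, "c": 3})
    jsond["d"] = 4                       # file now also holds 'd'
    del jsond["a"]                       # file no longer holds 'a'
    on_disk = json.loads(p2json.read_text())
    reloaded = MirroredDict(p2json)      # no dict passed: content is loaded from the file
```

Because `UserDict` routes `update`, `pop` and `setdefault` through `__setitem__` and `__delitem__`, overriding those two methods is enough to keep the file in sync in this sketch.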
is_type (obj:Any, obj_type:type, raise_error:bool=False)
Validate that obj is of type obj_type. Raise an error when it is not and raise_error is True
| | Type | Default | Details |
|---|---|---|---|
| obj | Any | | object whose type to validate |
| obj_type | type | | expected type for obj |
| raise_error | bool | False | when True, raise a ValueError if obj is not of the right type |
| Returns | bool | | True when obj is of the right type, False otherwise |
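The documented behavior can be sketched as follows. The name `is_type_sketch` is hypothetical and the use of `isinstance` is an assumption; the actual eccore function may check types differently.

```python
from typing import Any


def is_type_sketch(obj: Any, obj_type: type, raise_error: bool = False) -> bool:
    """Hypothetical re-creation of the documented behavior, based on isinstance."""
    if isinstance(obj, obj_type):
        return True
    if raise_error:
        raise ValueError(
            f"expected {obj_type.__name__}, got {type(obj).__name__}"
        )
    return False


int_is_int = is_type_sketch(1, int)
str_is_int = is_type_sketch("1", int)
```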
Functions to ensure paths are properly formatted and point to a real file or directory.
validate_path (path:str|pathlib.Path, path_type:str='file', raise_error:bool=False)
Validate that path is a Path or str and points to a real file or directory
| | Type | Default | Details |
|---|---|---|---|
| path | str \| pathlib.Path | | path to validate |
| path_type | str | file | type of the target path: 'file', 'dir' or 'any' |
| raise_error | bool | False | when True, raise a ValueError if path is not a file |
| Returns | bool | | True when path is a valid path, False otherwise |
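A minimal sketch of this kind of validation, using only `pathlib`. The name `validate_path_sketch` and the mapping of `'file'`, `'dir'` and `'any'` to `is_file`, `is_dir` and `exists` are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def validate_path_sketch(path, path_type="file", raise_error=False):
    """Hypothetical validation: right type and pointing to an existing target."""
    checks = {"file": Path.is_file, "dir": Path.is_dir, "any": Path.exists}
    if path_type not in checks:
        raise ValueError(f"path_type must be one of {sorted(checks)}")
    ok = isinstance(path, (str, Path)) and checks[path_type](Path(path))
    if not ok and raise_error:
        raise ValueError(f"{path!r} is not a valid {path_type} path")
    return ok


with tempfile.TemporaryDirectory() as tmp:
    p2file = Path(tmp) / "example.txt"
    p2file.write_text("hello")
    file_ok = validate_path_sketch(p2file)
    dir_ok = validate_path_sketch(tmp, path_type="dir")
    dir_as_file = validate_path_sketch(tmp, path_type="file")
    missing = validate_path_sketch(Path(tmp) / "missing.txt", path_type="any")
```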
safe_path (path:str|pathlib.Path)
*Return a Path object when given a valid path as a str or a Path, raise error otherwise
Note: This function does not check whether the file or directory exists.*
| | Type | Details |
|---|---|---|
| path | str \| pathlib.Path | path to validate |
| Returns | Path | validated path returned as a pathlib.Path |
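The conversion can be sketched in a few lines. `safe_path_sketch` is a hypothetical name, and raising `TypeError` is an assumption since the documentation does not state which error is raised.

```python
from pathlib import Path


def safe_path_sketch(path):
    """Hypothetical conversion: accept str or Path, reject anything else."""
    if isinstance(path, Path):
        return path
    if isinstance(path, str):
        return Path(path)
    # Assumption: the error type is not specified in the documentation
    raise TypeError(f"expected str or Path, got {type(path).__name__}")


# No existence check is made, in line with the note above
converted = safe_path_sketch("data-dev/jsondict-test.json")
```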
get_config_value (section:str, key:str, path_to_config_file:Union[pathlib.Path,str,NoneType]=None)
*Returns the value corresponding to the key-value pair in the configuration file (configparser format)
When no path_to_config_file is provided, the function will try to find the file in: the system’s home, the parent directory of the current directory, and the Google drive directory mounted to the Colab environment.*
| | Type | Default | Details |
|---|---|---|---|
| section | str | | section in the configparser cfg file |
| key | str | | key in the selected section |
| path_to_config_file | Union | None | path to the cfg file |
| Returns | Any | | the value corresponding to section>key>value |
By default (path_to_config_file is None), it is assumed that the configuration file is located in:
- the local package config directory (home/.eccore/)
- the working directory
- the folder above the working directory
- the private-accross-accounts directory on Google Drive.
File names are expected to be either config-api-keys.cfg or config-sample.cfg.
If not, a path to the file (Path or str) must be provided.
The configuration file is expected to be in the format described in the documentation of the standard module configparser:
[DEFAULT]
key = value
[section_name]
key = value
[section_name]
key = value
found /home/vtec/projects/ec-packages/eccore/config-api-keys.cfg
Using config file at /home/vtec/projects/ec-packages/eccore/config-api-keys.cfg
'Etienne Charlier'
path2cfg = Path('../config-sample.cfg').resolve()
assert path2cfg.is_file(), f"{path2cfg} is not a file"
print(path2cfg.absolute())
with open(path2cfg, 'r') as fp:
    print(fp.read())
/home/vtec/projects/ec-packages/eccore/config-sample.cfg
[azure]
azure-api-key= dummy_api_key_for_azure
[github]
git_name = not_my_real_github_name
git_email = not_my_real_git_email
github_username = not_my_real_git_username
[kaggle]
kaggle_username = not_my_real_kaggle_name
kaggle_key = dummy_api_key_for_kaggle
[wandb]
api_key = dummy_api_key_for_wandb
value = get_config_value(section='azure', key='azure-api-key', path_to_config_file=path2cfg)
assert value == 'dummy_api_key_for_azure'
Using config file at /home/vtec/projects/ec-packages/eccore/config-sample.cfg
value = get_config_value(section='kaggle', key='kaggle_username', path_to_config_file=path2cfg)
assert value == 'not_my_real_kaggle_name'
Using config file at /home/vtec/projects/ec-packages/eccore/config-sample.cfg
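For readers unfamiliar with the configparser format, the lookup can be reproduced with the standard library alone. This is a sketch, not the body of get_config_value: `read_value_sketch` is a hypothetical helper, and the `owner` key in `[DEFAULT]` is invented for illustration.

```python
import configparser

# Sample content in configparser format; the 'owner' key is made up for this sketch
CFG_TEXT = """
[DEFAULT]
owner = nobody

[azure]
azure-api-key = dummy_api_key_for_azure

[kaggle]
kaggle_username = not_my_real_kaggle_name
"""


def read_value_sketch(cfg_text: str, section: str, key: str) -> str:
    """Hypothetical helper: parse the text and return section > key > value."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return parser[section][key]


azure_key = read_value_sketch(CFG_TEXT, "azure", "azure-api-key")
# Keys defined in [DEFAULT] are visible from every section
kaggle_owner = read_value_sketch(CFG_TEXT, "kaggle", "owner")
```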
CurrentMachine (*args, **kwargs)
*Callable class representing the current machine. When called, an instance returns a dict of all attrs:
- os: the operating system running on the machine
- home: path to home on the machine
- is_local, is_colab, is_kaggle: whether the machine is running locally or not
- p2config: path to the config file
- package_root: path to the package root directory

CurrentMachine is a singleton class.*
{'os': 'linux',
'home': Path('/home/vtec'),
'is_local': True,
'is_colab': False,
'is_kaggle': False,
'p2config': Path('/home/vtec/.ecutilities/ecutilities.cfg'),
'package_root': Path('/home/vtec/projects/ec-packages/eccore')}
This machine is not registered as a local machine, but is also not running in the cloud. We should register it as a local machine with register_as_local
CurrentMachine.register_as_local ()
Update the configuration file to register the machine as a local machine
Use this method to register the current machine as a local machine. It only needs to be used once on a machine. Do not use on cloud VMs.
(True, False, False)
Technical Note:
The configuration file is located at a standard location, which varies depending on the OS:
- Windows:
  - home is C:\Users\username
  - application data in C:\Users\username\AppData\Local\... or C:\Users\username\AppData\Roaming\... (see StackExchange)
  - applications can also be loaded under a dedicated directory under C:\Users\username, like C:\Users\username\.conda\...
- Linux:
  - home is /home/username
  - application data in a file or dedicated directory under /home/username/, such as:
    - file in home directory, e.g. .gitconfig
    - file in an application dedicated directory, e.g. /home/username/.conda/...

ecutilities places the configuration file in a dedicated directory in the home directory:
- C:\Users\username\.ecutilities\ecutilities.cfg
- /home/username/.ecutilities/ecutilities.cfg

Retrieve the OS:
- win32 with Windows
- linux with linux
- darwin with macOS

Accessing the correct path depending on the OS:
- WindowsPath('C:/Users/username') with Windows
- Path('/home/username') with linux
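Both lookups in this note are available from the standard library. The sketch below assumes `sys.platform` and `Path.home()` are the calls involved, which matches the values listed above but is not confirmed by the source; `config_file_sketch` is a hypothetical helper.

```python
import sys
from pathlib import Path


def config_file_sketch(app_name: str = "ecutilities") -> Path:
    """Hypothetical helper: <home>/.<app_name>/<app_name>.cfg on any OS."""
    return Path.home() / f".{app_name}" / f"{app_name}.cfg"


os_name = sys.platform          # e.g. 'win32', 'linux' or 'darwin'
p2config = config_file_sketch()
print(os_name, p2config)
```

`Path.home()` returns a `WindowsPath` or a `PosixPath` depending on the platform, so the same expression yields the right location on each OS.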
ProjectFileSystem (*args, **kwargs)
*Class representing the project file system and key subfolders (data, nbs, src)
Sets paths to key directories, according to whether the code is running locally or in the cloud. Gives access to the paths to these key folders and to information about the environment.*
{'os': 'linux',
'home': Path('/home/vtec'),
'is_local': True,
'is_colab': False,
'is_kaggle': False,
'p2config': Path('/home/vtec/.ecutilities/ecutilities.cfg'),
'package_root': Path('/home/vtec/projects/ec-packages/eccore')}
ProjectFileSystem.create_project_file_system (p2project_root, overwrite=False)
*Create a standard project file system with the following structure:*
```
project_root
    |--- data    all data files
    |--- nbs     all notebooks for work and experiments
    |--- src     all scripts and code
```
| | **Type** | **Default** | **Details** |
| -- | -------- | ----------- | ----------- |
| p2project_root | | | path to project root, where all subfolders will be located |
| overwrite | bool | False | overwrite current folders if they exist when True (not implemented yet) |
pfs.create_project_file_system(Path('/home/vtec/projects/ec-packages/eccore'))
/home/vtec/projects/ec-packages/eccore/data
/home/vtec/projects/ec-packages/eccore/nbs
/home/vtec/projects/ec-packages/eccore/src
Created project file system in /home/vtec/projects/ec-packages/eccore
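The folder creation step can be sketched with `pathlib` in a temporary directory. `create_project_dirs_sketch` is a hypothetical function, and skipping existing folders with `exist_ok=True` is an assumption about the non-overwrite behavior.

```python
import tempfile
from pathlib import Path


def create_project_dirs_sketch(p2project_root):
    """Hypothetical helper: create data, nbs and src under the project root."""
    root = Path(p2project_root)
    created = []
    for name in ("data", "nbs", "src"):
        p2dir = root / name
        # Assumption: existing folders are left untouched
        p2dir.mkdir(parents=True, exist_ok=True)
        created.append(p2dir)
    return created


with tempfile.TemporaryDirectory() as tmp:
    created = create_project_dirs_sketch(tmp)
    names = sorted(p.name for p in created)
    all_exist = all(p.is_dir() for p in created)
```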
setup_logging (logfile:pathlib.Path|None=None)
Set up logging to console and to file if logfile is not None
logthis (*args)
Log all elements passed as arguments
monitor_fn (fn)
Highlights when a function is entered and exited
After setting up the logging, it is easy to create log entries:
Logging to console and to /home/vtec/projects/ec-packages/eccore/nbs-dev/data-dev/dev.log.
Logging setup finished
logging.info('Logging manually as info, only shows in logfile')
logging.warning('Logging manually as warning, shows in logfile and console')
logthis('Using log function, as info, only shows in the log file')
2025-04-27 20:45:42: Logging manually as warning, shows in logfile and console
See logfile content:
if p2log.exists():
    print('Log file content:')
    with open(p2log, 'r') as f:
        print(''.join(f.readlines()))
Log file content:
2025-04-27 20:45:42: Logging manually as info, only shows in logfile
2025-04-27 20:45:42: Logging manually as warning, shows in logfile and console
2025-04-27 20:45:42: Using log function, as info, only shows in the log file
Create a function decorated with @monitor_fn to monitor function calls, i.e. when the function is entered and exited.
@monitor_fn
def a_function(a, b):
    """Test function to add two numbers"""
    return a + b

print(f"function output is {a_function(1,2)}")
print()
function output is 3
if p2log.exists():
    print('Log file content:')
    with open(p2log, 'r') as f:
        print(''.join(f.readlines()))
    p2log.unlink()
Log file content:
2025-04-27 20:45:42: Logging manually as info, only shows in logfile
2025-04-27 20:45:42: Logging manually as warning, shows in logfile and console
2025-04-27 20:45:42: Using log function, as info, only shows in the log file
2025-04-27 20:46:59: Entering `a_function`
2025-04-27 20:46:59: Exiting `a_function`
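A decorator producing the entering and exiting entries seen in the log above could look like the sketch below. It is not the eccore implementation: `monitor_fn_sketch`, the logger name and the in-memory `ListHandler` are all hypothetical, introduced so the example runs without a log file.

```python
import functools
import logging

logger = logging.getLogger("monitor_sketch")   # hypothetical logger name
logger.setLevel(logging.INFO)


class ListHandler(logging.Handler):
    """Keeps formatted messages in memory so the example needs no log file."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


handler = ListHandler()
logger.addHandler(handler)


def monitor_fn_sketch(fn):
    """Log an entry before calling fn and another one after it returns or raises."""
    @functools.wraps(fn)                        # keep fn's name and docstring
    def wrapper(*args, **kwargs):
        logger.info("Entering `%s`", fn.__name__)
        try:
            return fn(*args, **kwargs)
        finally:
            logger.info("Exiting `%s`", fn.__name__)
    return wrapper


@monitor_fn_sketch
def a_function(a, b):
    """Test function to add two numbers"""
    return a + b


result = a_function(1, 2)
```

The `try`/`finally` pair is a design choice in this sketch: the exit entry is written even when the wrapped function raises.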
files_in_tree (path:str|pathlib.Path, pattern:str|None=None)
List files in directory and its subdirectories, print tree starting from parent directory
| | Type | Default | Details |
|---|---|---|---|
| path | str \| pathlib.Path | | path to the directory to scan |
| pattern | str \| None | None | pattern (glob style) to match in file name to filter the content |
p2dir = Path('').resolve()
print(p2dir, '\n')
files = files_in_tree(p2dir)
print(f"List of {len(files)} files when unfiltered")
/home/vtec/projects/ec-packages/eccore/nbs-dev
eccore
|--nbs-dev
| |--0_02_plotting.ipynb (0)
| |--0_01_ipython.ipynb (1)
| |--0_00_core.ipynb (2)
| |--.last_checked (3)
| |--sidebar.yml (4)
| |--index.ipynb (5)
| |--nbdev.yml (6)
| |--9_01_dev_utils.ipynb (7)
| |--styles.css (8)
| |--_quarto.yml (9)
| |--data-dev
| | |--jsondict-test.json (10)
| | |--ten-blobs-6-cols-clusters.npy (11)
| | |--ten-blobs-6-cols-y.npy (12)
| | |--ten-blobs-6-cols-X.npy (13)
List of 14 files when unfiltered
Use pattern to filter the paths to return (using glob syntax)
eccore
|--nbs-dev
| |--0_02_plotting.ipynb (0)
| |--0_01_ipython.ipynb (1)
| |--0_00_core.ipynb (2)
| |--index.ipynb (3)
| |--9_01_dev_utils.ipynb (4)
| |--data-dev
List of 5 files when filtered
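The filtering part can be approximated with `Path.rglob`. This sketch only returns the list of files and does not print a tree; `list_files_sketch` is a hypothetical name, and passing the pattern straight to `rglob` is an assumption about how the glob pattern is applied.

```python
import tempfile
from pathlib import Path


def list_files_sketch(path, pattern=None):
    """Hypothetical helper: files under path, optionally filtered by a glob pattern."""
    root = Path(path)
    return sorted(p for p in root.rglob(pattern or "*") if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "data-dev").mkdir()
    for rel in ("a.ipynb", "styles.css", "data-dev/b.ipynb", "data-dev/x.json"):
        (root / rel).write_text("")
    all_files = list_files_sketch(root)
    notebooks = list_files_sketch(root, pattern="*.ipynb")
    notebook_names = sorted(p.name for p in notebooks)
```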
path_to_parent_dir (pattern:str, path:str|pathlib.Path|None=None)
*Climb directory tree up to a directory starting with pattern, and return its path.
When no directory is found in the tree starting with pattern, return the current directory path.
It is possible to pass a path as starting path to climb from.*
| | Type | Default | Details |
|---|---|---|---|
| pattern | str | | pattern to identify the parent directory |
| path | str \| pathlib.Path \| None | None | optional path from where to seek for parent directory |
| Returns | Path | | path of the parent directory |
p2dir = path_to_parent_dir('nbs')
assert 'nbs-dev' in p2dir.parts and 'nbs' not in p2dir.parts
p2dir
Path('/home/vtec/projects/ec-packages/eccore/nbs-dev')
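Climbing the tree can be sketched by walking `Path.parents`. `climb_to_dir_sketch` is a hypothetical name; returning the starting directory when nothing matches is an assumption modeled on the documented fallback to the current directory.

```python
import tempfile
from pathlib import Path


def climb_to_dir_sketch(pattern, path=None):
    """Hypothetical helper: nearest directory, going up, whose name starts with pattern."""
    start = Path(path).resolve() if path is not None else Path.cwd()
    for candidate in (start, *start.parents):
        if candidate.name.startswith(pattern):
            return candidate
    return start    # assumption: fall back to the starting directory


with tempfile.TemporaryDirectory() as tmp:
    deep = Path(tmp) / "eccore" / "nbs-dev" / "data-dev"
    deep.mkdir(parents=True)
    deep_resolved = deep.resolve()
    found = climb_to_dir_sketch("nbs", deep)
    fallback = climb_to_dir_sketch("zzz-no-such-dir", deep)
```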