with open('data-dev/jsondict-test.json', 'r') as fp:
    d = json.load(fp)
d
d.items()
dict_items([('b', 2), ('c', 3), ('d', 4)])
core
List of fastcore objects available for use once eccore is imported:
- from basics:
  - gen, chunked
  - NS, SimpleNameSpace
  - store_attr
  - getattrs, hasattrs, setattrs
  - listify, tuplify, setify, uniqueify, stringify …
  - filter_x, argwhere, val2idx
  - patch_to
  - PrettyString
  - ipython_shell, in_ipython, in_colab, in_jupyter, in_notebook
- from foundation:
  - L
  - Config
- from Utility functions:
  - untar_dir
  - run
  - Path with added capabilities
- from Meta:
  - delegates, use_kwargs, funcs_kwargs, method
Classes to handle data structure more easily
JsonDict (p2json:str|pathlib.Path, dictionary:Optional[dict]=None)
*Dictionary whose current value is mirrored in a json file and can be initiated from a json file
JsonDict requires a path to a json file at creation. An optional dict can be passed as argument.
Behavior at creation:
- JsonDict(p2json, dict) will create a JsonDict with key-values from dict, and mirrored in p2json
- JsonDict(p2json) will create a JsonDict with an empty dictionary and load the json content if the file exists
Once created, JsonDict instances behave exactly as a dictionary*
| | **Type** | **Default** | **Details** |
| -- | -------- | ----------- | ----------- |
| p2json | str \| pathlib.Path | | path to the json file to mirror with the dictionary |
| dictionary | Optional | None | optional dictionary to initialize the JsonDict |
Create a new dictionary mirrored to a JSON file:
d = {'a': 1, 'b': 2, 'c': 3}
p2json = Path('data-dev/jsondict-test.json')
jsond = JsonDict(p2json, d)
jsond
{'a': 1, 'b': 2, 'c': 3}
dict mirrored in /home/vtec/projects/ec-packages/eccore/nbs-dev/data-dev/jsondict-test.json
Once created, the JsonDict instance behaves exactly like a dictionary, with the added benefit that any change to the dictionary is automatically saved to the JSON file.
key: a; value: 1
key: b; value: 2
key: c; value: 3
Adding or removing a value works in the same way as for a normal dictionary, but the JSON file is automatically updated.
{'a': 1, 'b': 2, 'c': 3, 'd': 4}
dict mirrored in /home/vtec/projects/ec-packages/eccore/nbs-dev/data-dev/jsondict-test.json
{'b': 2, 'c': 3, 'd': 4}
dict mirrored in /home/vtec/projects/ec-packages/eccore/nbs-dev/data-dev/jsondict-test.json
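The mirroring behavior described above can be sketched with a small `dict` subclass. This is only an illustration of the idea, not eccore's implementation: the class name `MirroredDict` and its `_save` helper are hypothetical, and only item assignment and deletion are mirrored here.

```python
import json
import tempfile
from pathlib import Path


class MirroredDict(dict):
    """Hypothetical dict subclass that rewrites a JSON file after each change."""

    def __init__(self, p2json, dictionary=None):
        self.p2json = Path(p2json)
        if dictionary is not None:
            # Start from the given dict and mirror it to the file right away
            super().__init__(dictionary)
            self._save()
        elif self.p2json.is_file():
            # No dict given: load the existing JSON content
            super().__init__(json.loads(self.p2json.read_text()))
        else:
            super().__init__()

    def _save(self):
        # Rewrite the whole file; fine for small dictionaries
        self.p2json.write_text(json.dumps(self, indent=2))

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._save()

    def __delitem__(self, key):
        super().__delitem__(key)
        self._save()


# Usage: mirror a dict to a file in a temporary directory
p2json = Path(tempfile.mkdtemp()) / "mirror.json"
md = MirroredDict(p2json, {'a': 1, 'b': 2, 'c': 3})
md['d'] = 4        # the file now also holds 'd'
del md['a']        # the file no longer holds 'a'
on_disk = json.loads(p2json.read_text())
reloaded = MirroredDict(p2json)   # initialized from the file content
```

Methods such as `update` or `pop` bypass `__setitem__` and `__delitem__` on a `dict` subclass, so a complete version would need to override them as well.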
is_type (obj:Any, obj_type:type, raise_error:bool=False)
Validate that obj is of type obj_type. Raise an error otherwise when raise_error is True
| | **Type** | **Default** | **Details** |
| -- | -------- | ----------- | ----------- |
| obj | Any | | object whose type to validate |
| obj_type | type | | expected type for obj |
| raise_error | bool | False | when True, raise a ValueError if obj is not of the right type |
| **Returns** | bool | | True when obj is of the right type, False otherwise |
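A minimal sketch of the behavior documented in the table above. The function name `is_type_sketch` is hypothetical, and the use of `isinstance` (which also accepts subclasses) is an assumption; eccore's own check may differ.

```python
from typing import Any


def is_type_sketch(obj: Any, obj_type: type, raise_error: bool = False) -> bool:
    """Return True when obj is an instance of obj_type, False otherwise."""
    if isinstance(obj, obj_type):
        return True
    if raise_error:
        # Mirror the documented behavior: ValueError only when requested
        raise ValueError(
            f"expected {obj_type.__name__}, got {type(obj).__name__}"
        )
    return False


ok = is_type_sketch(1, int)
not_ok = is_type_sketch('1', int)

try:
    is_type_sketch('1', int, raise_error=True)
    raised = False
except ValueError:
    raised = True
```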
Functions to ensure paths are properly formatted and point to a real file or directory.
validate_path (path:str|pathlib.Path, path_type:str='file', raise_error:bool=False)
Validate that path is a Path or str and points to a real file or directory
| | **Type** | **Default** | **Details** |
| -- | -------- | ----------- | ----------- |
| path | str \| pathlib.Path | | path to validate |
| path_type | str | file | type of the target path: 'file', 'dir' or 'any' |
| raise_error | bool | False | when True, raise a ValueError if path does not point to a file |
| **Returns** | bool | | True when path is a valid path, False otherwise |
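The checks listed in the table can be sketched with `pathlib` alone. The name `validate_path_sketch` is hypothetical, and how eccore treats an unknown `path_type` is not documented here, so raising a `ValueError` in that case is an assumption.

```python
import tempfile
from pathlib import Path


def validate_path_sketch(path, path_type='file', raise_error=False):
    """Return True when path is a str or Path pointing to an existing target."""
    checks = {'file': Path.is_file, 'dir': Path.is_dir, 'any': Path.exists}
    if path_type not in checks:
        # Assumption: reject unknown path types explicitly
        raise ValueError(f"path_type must be one of {sorted(checks)}")
    ok = isinstance(path, (str, Path)) and checks[path_type](Path(path))
    if not ok and raise_error:
        raise ValueError(f"{path!r} is not a valid {path_type} path")
    return ok


# Usage with a temporary directory and file
tmp_dir = Path(tempfile.mkdtemp())
tmp_file = tmp_dir / "example.txt"
tmp_file.write_text("content")

file_ok = validate_path_sketch(tmp_file)                     # existing file
dir_as_file = validate_path_sketch(tmp_dir)                  # a dir is not a file
dir_ok = validate_path_sketch(str(tmp_dir), path_type='dir')
missing = validate_path_sketch(tmp_dir / "missing.txt", path_type='any')
```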
safe_path (path:str|pathlib.Path)
*Return a Path object when given a valid path as a str or a Path, raise an error otherwise
Note: This function does not check whether the file or directory exists.*
| | **Type** | **Details** |
| -- | -------- | ----------- |
| path | str \| pathlib.Path | path to validate |
| **Returns** | Path | validated path returned as a pathlib.Path |
get_config_value (section:str, key:str, path_to_config_file:Union[pathlib.Path,str,NoneType]=None)
*Returns the value corresponding to the key-value pair in the configuration file (configparser format)
When no path_to_config_file is provided, the function will try to find the file in: the system’s home, the parent directory of the current directory, and the Google drive directory mounted to the Colab environment.*
| | **Type** | **Default** | **Details** |
| -- | -------- | ----------- | ----------- |
| section | str | | section in the configparser cfg file |
| key | str | | key in the selected section |
| path_to_config_file | Union | None | path to the cfg file |
| **Returns** | Any | | the value corresponding to section>key>value |
By default (path_to_config_file is None), it is assumed that the configuration file is located in one of:
- the local package config directory (home/.eccore/)
- the working directory
- the folder above the working directory
- the private-accross-accounts directory on google drive.

File names are expected to be either config-api-keys.cfg or config-sample.cfg.
If not, a path to the file (Path or str) must be provided.
The configuration file is expected to be in the format used by the standard module configparser (see its documentation):
[DEFAULT]
key = value
[section_name]
key = value
[section_name]
key = value
found /home/vtec/projects/ec-packages/eccore/config-api-keys.cfg
Using config file at /home/vtec/projects/ec-packages/eccore/config-api-keys.cfg
'Etienne Charlier'
path2cfg = Path('../config-sample.cfg').resolve()
assert path2cfg.is_file(), f"{path2cfg} is not a file"
print(path2cfg.absolute())
with open(path2cfg, 'r') as fp:
    print(fp.read())
/home/vtec/projects/ec-packages/eccore/config-sample.cfg
[azure]
azure-api-key= dummy_api_key_for_azure
[github]
git_name = not_my_real_github_name
git_email = not_my_real_git_email
github_username = not_my_real_git_username
[kaggle]
kaggle_username = not_my_real_kaggle_name
kaggle_key = dummy_api_key_for_kaggle
[wandb]
api_key = dummy_api_key_for_wandb
value = get_config_value(section='azure', key='azure-api-key', path_to_config_file=path2cfg)
assert value == 'dummy_api_key_for_azure'
Using config file at /home/vtec/projects/ec-packages/eccore/config-sample.cfg
value = get_config_value(section='kaggle', key='kaggle_username', path_to_config_file=path2cfg)
assert value == 'not_my_real_kaggle_name'
Using config file at /home/vtec/projects/ec-packages/eccore/config-sample.cfg
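The section/key lookup shown above relies on the configparser format. A minimal standard-library sketch of such a lookup follows; the helper `read_value_sketch` is hypothetical, reads from a string rather than a file to stay self-contained, and does not reproduce the file-search logic of `get_config_value`.

```python
import configparser

# Two sections copied from the sample configuration shown above
SAMPLE_CFG = """
[azure]
azure-api-key = dummy_api_key_for_azure

[kaggle]
kaggle_username = not_my_real_kaggle_name
kaggle_key = dummy_api_key_for_kaggle
"""


def read_value_sketch(cfg_text, section, key):
    """Return the value stored under section/key, or None when absent."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    if not parser.has_section(section):
        return None
    return parser.get(section, key, fallback=None)


azure_key = read_value_sketch(SAMPLE_CFG, 'azure', 'azure-api-key')
kaggle_user = read_value_sketch(SAMPLE_CFG, 'kaggle', 'kaggle_username')
absent = read_value_sketch(SAMPLE_CFG, 'wandb', 'api_key')
```

Returning `None` for a missing section or key is a choice made for this sketch; the behavior of `get_config_value` in that case is not described in this page.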
CurrentMachine (*args, **kwargs)
*Callable class representing the current machine. When called, the instance returns a dict of all attrs:
- os: the operating system running on the machine
- home: path to home on the machine
- is_local, is_colab, is_kaggle: whether the machine is running locally or not
- p2config: path to the config file
- package_root: path to the package root directory

CurrentMachine is a singleton class.*
{'os': 'linux',
'home': Path('/home/vtec'),
'is_local': True,
'is_colab': False,
'is_kaggle': False,
'p2config': Path('/home/vtec/.ecutilities/ecutilities.cfg'),
'package_root': Path('/home/vtec/projects/ec-packages/eccore')}
This machine is not registered as a local machine, but is also not running in the cloud. We should register it as a local machine with register_as_local.
CurrentMachine.register_as_local ()
Update the configuration file to register the machine as local machine
Use this method to register the current machine as local machine. Only needs to be used once on a machine. Do not use on cloud VMs
(True, False, False)
Technical Note:
The configuration file is located at a standard location, which varies depending on the OS:
- Windows:
  - home is C:\Users\username
  - application data in C:\Users\username\AppData\Local\... or C:\Users\username\AppData\Roaming\... (see StackExchange)
  - applications can also be loaded under a dedicated directory under C:\Users\username, like C:\Users\username\.conda\...
- Linux:
  - home is /home/username
  - application data in a file or dedicated directory under /home/username/, such as:
    - file in home directory, e.g. .gitconfig
    - file in an application dedicated directory, e.g. /home/username/.conda/...

ecutilities places the configuration file in a dedicated directory in the home directory:
- C:\Users\username\.ecutilities\ecutilities.cfg
- /home/username/.ecutilities/ecutilities.cfg
Retrieve the OS:
- win32 with Windows
- linux with Linux
- darwin with macOS
Accessing the correct path depending on the OS:
- WindowsPath('C:/Users/username') with Windows
- Path('/home/username') with Linux
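Both values can be retrieved with the standard library, as sketched below. The calls that produced the outputs above are not shown in this page, so the use of `sys.platform` and `Path.home()` is an assumption that matches the listed values; the config path simply follows the location given in the technical note.

```python
import sys
from pathlib import Path

# 'win32' on Windows, 'linux' on Linux, 'darwin' on macOS
os_name = sys.platform

# Home directory of the current user, as a Path of the right flavor for the OS
home = Path.home()

# Location of the configuration file as described in the technical note
p2config = home / '.ecutilities' / 'ecutilities.cfg'

print(os_name)
print(p2config)
```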
ProjectFileSystem (*args, **kwargs)
*Class representing the project file system and key subfolders (data, nbs, src)
Sets paths to key directories, according to whether the code is running locally or in the cloud. Gives access to the paths to these key folders and to information about the environment.*
{'os': 'linux',
'home': Path('/home/vtec'),
'is_local': True,
'is_colab': False,
'is_kaggle': False,
'p2config': Path('/home/vtec/.ecutilities/ecutilities.cfg'),
'package_root': Path('/home/vtec/projects/ec-packages/eccore')}
ProjectFileSystem.create_project_file_system (p2project_root, overwrite=False)
*Create a standard project file system with the following structure:*
```
project_root
  |--- data    all data files
  |--- nbs     all notebooks for work and experiments
  |--- src     all scripts and code
```
| | **Type** | **Default** | **Details** |
| -- | -------- | ----------- | ----------- |
| p2project_root | | | path to project root, where all subfolders will be located |
| overwrite | bool | False | overwrite current folders if they exist when True (not implemented yet) |

pfs.create_project_file_system(Path('/home/vtec/projects/ec-packages/eccore'))
/home/vtec/projects/ec-packages/eccore/data
/home/vtec/projects/ec-packages/eccore/nbs
/home/vtec/projects/ec-packages/eccore/src
Created project file system in /home/vtec/projects/ec-packages/eccore
setup_logging (logfile:pathlib.Path|None=None)
Setup logging to console and to file if logfile is not None
logthis (*args)
Logs all elements passed to logs
monitor_fn (fn)
Highlights when a function is entered and exited
After setting up the logging, it is easy to create log entries:
Logging to console and to /home/vtec/projects/ec-packages/eccore/nbs-dev/data-dev/dev.log.
Logging setup finished
logging.info('Logging manually as info, only shows in logfile')
logging.warning('Logging manually as warning, shows in logfile and console')
logthis('Using log function, as info, only shows in the log file')
2025-04-27 20:45:42: Logging manually as warning, shows in logfile and console
See logfile content:
if p2log.exists():
    print('Log file content:')
    with open(p2log, 'r') as f:
        print(''.join(f.readlines()))
Log file content:
2025-04-27 20:45:42: Logging manually as info, only shows in logfile
2025-04-27 20:45:42: Logging manually as warning, shows in logfile and console
2025-04-27 20:45:42: Using log function, as info, only shows in the log file
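The split seen above, where info messages reach only the log file while warnings reach both the file and the console, can be reproduced with two handlers set to different levels. This is a sketch under assumptions: the function name `setup_logging_sketch` and the logger name `"sketch"` are hypothetical, and eccore's `setup_logging` may be implemented differently.

```python
import logging
import tempfile
from pathlib import Path


def setup_logging_sketch(logfile=None):
    """Log WARNING and above to the console, INFO and above to logfile."""
    logger = logging.getLogger("sketch")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()          # avoid duplicate handlers on re-run
    logger.propagate = False
    fmt = logging.Formatter("%(asctime)s: %(message)s",
                            datefmt="%Y-%m-%d %H:%M:%S")

    console = logging.StreamHandler()
    console.setLevel(logging.WARNING)
    console.setFormatter(fmt)
    logger.addHandler(console)

    if logfile is not None:
        file_handler = logging.FileHandler(logfile, mode="w", encoding="utf-8")
        file_handler.setLevel(logging.INFO)
        file_handler.setFormatter(fmt)
        logger.addHandler(file_handler)
    return logger


# Usage: write to a log file in a temporary directory
log_path = Path(tempfile.mkdtemp()) / "dev.log"
logger = setup_logging_sketch(log_path)
logger.info("as info, only shows in the log file")
logger.warning("as warning, shows in the log file and on the console")
log_text = log_path.read_text(encoding="utf-8")
```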
Create a function decorated with @monitor_fn to monitor function calls, i.e. to log when the function is entered and exited.
@monitor_fn
def a_function(a,b):
    """Test function to add two numbers"""
    return a + b
print(f"function output is {a_function(1,2)}")
print(f"")
function output is 3
if p2log.exists():
    print('Log file content:')
    with open(p2log, 'r') as f:
        print(''.join(f.readlines()))
    p2log.unlink()
Log file content:
2025-04-27 20:45:42: Logging manually as info, only shows in logfile
2025-04-27 20:45:42: Logging manually as warning, shows in logfile and console
2025-04-27 20:45:42: Using log function, as info, only shows in the log file
2025-04-27 20:46:59: Entering `a_function`
2025-04-27 20:46:59: Exiting `a_function`
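A decorator producing the "Entering" and "Exiting" entries above could look like the sketch below. The name `monitor_fn_sketch` is hypothetical and this is not eccore's code; it only illustrates the wrapping technique with `functools.wraps`.

```python
import functools
import logging


def monitor_fn_sketch(fn):
    """Log an entry when fn is entered and another when it is exited."""
    @functools.wraps(fn)             # keep the name and docstring of fn
    def wrapper(*args, **kwargs):
        logging.info("Entering `%s`", fn.__name__)
        try:
            return fn(*args, **kwargs)
        finally:
            # Runs whether fn returns normally or raises
            logging.info("Exiting `%s`", fn.__name__)
    return wrapper


@monitor_fn_sketch
def add(a, b):
    """Add two numbers."""
    return a + b


result = add(1, 2)
```

Using `try`/`finally` means the exit entry is logged even when the wrapped function raises an exception.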
files_in_tree (path:str|pathlib.Path, pattern:str|None=None)
List files in directory and its subdirectories, print tree starting from parent directory
| | **Type** | **Default** | **Details** |
| -- | -------- | ----------- | ----------- |
| path | str \| pathlib.Path | | path to the directory to scan |
| pattern | str \| None | None | pattern (glob style) to match in file name to filter the content |
p2dir = Path('').resolve()
print(p2dir, '\n')
files = files_in_tree(p2dir)
print(f"List of {len(files)} files when unfiltered")
/home/vtec/projects/ec-packages/eccore/nbs-dev
eccore
|--nbs-dev
| |--0_02_plotting.ipynb (0)
| |--0_01_ipython.ipynb (1)
| |--0_00_core.ipynb (2)
| |--.last_checked (3)
| |--sidebar.yml (4)
| |--index.ipynb (5)
| |--nbdev.yml (6)
| |--9_01_dev_utils.ipynb (7)
| |--styles.css (8)
| |--_quarto.yml (9)
| |--data-dev
| | |--jsondict-test.json (10)
| | |--ten-blobs-6-cols-clusters.npy (11)
| | |--ten-blobs-6-cols-y.npy (12)
| | |--ten-blobs-6-cols-X.npy (13)
List of 14 files when unfiltered
Use pattern to filter the paths to return (using glob syntax)
eccore
|--nbs-dev
| |--0_02_plotting.ipynb (0)
| |--0_01_ipython.ipynb (1)
| |--0_00_core.ipynb (2)
| |--index.ipynb (3)
| |--9_01_dev_utils.ipynb (4)
| |--data-dev
List of 5 files when filtered
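Recursive listing with an optional glob filter can be sketched with `Path.rglob`. The helper `list_files_sketch` is hypothetical and does not print a tree; passing the pattern straight to `rglob` is an assumption, since the exact way `files_in_tree` interprets `pattern` is not shown here.

```python
import tempfile
from pathlib import Path


def list_files_sketch(path, pattern=None):
    """Return all files under path, optionally filtered by a glob pattern."""
    root = Path(path)
    glob_pattern = "*" if pattern is None else pattern
    # rglob walks the directory and all its subdirectories
    return sorted(p for p in root.rglob(glob_pattern) if p.is_file())


# Usage on a small temporary tree
root = Path(tempfile.mkdtemp())
(root / "data-dev").mkdir()
(root / "index.ipynb").write_text("{}")
(root / "styles.css").write_text("")
(root / "data-dev" / "analysis.ipynb").write_text("{}")

all_files = list_files_sketch(root)
notebooks = list_files_sketch(root, pattern="*.ipynb")
print(f"List of {len(all_files)} files when unfiltered")
print(f"List of {len(notebooks)} files when filtered")
```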
path_to_parent_dir (pattern:str, path:str|pathlib.Path|None=None)
*Climb directory tree up to a directory starting with pattern, and return its path.
When no directory is found in the tree starting with pattern, return the current directory path.
It is possible to pass a path as starting path to climb from.*
| | **Type** | **Default** | **Details** |
| -- | -------- | ----------- | ----------- |
| pattern | str | | pattern to identify the parent directory |
| path | str \| pathlib.Path \| None | None | optional path from where to seek for parent directory |
| **Returns** | Path | | path of the parent directory |
p2dir = path_to_parent_dir('nbs')
assert 'nbs-dev' in p2dir.parts and 'nbs' not in p2dir.parts
p2dir
Path('/home/vtec/projects/ec-packages/eccore/nbs-dev')
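The climb described above can be sketched by walking `Path.parents`. The name `climb_to_parent_sketch` is hypothetical, and returning the starting path when nothing matches is an assumption about what "the current directory path" means when a starting `path` is passed.

```python
from pathlib import Path


def climb_to_parent_sketch(pattern, path=None):
    """Return the closest directory, at or above path, whose name starts with pattern."""
    start = Path.cwd() if path is None else Path(path)
    # Check the starting directory first, then each ancestor up to the root
    for candidate in (start, *start.parents):
        if candidate.name.startswith(pattern):
            return candidate
    # Assumption: fall back to the starting directory when nothing matches
    return start


found = climb_to_parent_sketch('nbs', '/home/user/project/nbs-dev/data-dev')
fallback = climb_to_parent_sketch('zzz', '/home/user/project')
```

As in the example above, the match is a prefix match, so the pattern 'nbs' selects a directory named nbs-dev.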