python-advanced
Advanced features
lambda
A lambda is like a function with a single statement whose value is returned implicitly, but it has no name (an anonymous function). If you need more logic, use a regular function instead. Creating a lambda creates a function object that you can call later on.
lambda features
- return implicitly
- one statement
- anonymous function
f = lambda arg1, arg2: expression
The result of the expression is the return value. The same one-expression style also appears in list comprehensions:
[expression for x in a_list if condition]
The most common use case of a lambda is this:
- some API parameter needs a function
- the function is small
In this case, use a lambda instead of a named function.
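For example, sorted() takes a key function, and a one-expression lambda fits perfectly (the tuple data here is illustrative):

```python
pairs = [(2, 'b'), (1, 'c'), (3, 'a')]

# sort by the second element of each tuple
by_letter = sorted(pairs, key=lambda p: p[1])
print(by_letter)  # [(3, 'a'), (2, 'b'), (1, 'c')]

# the equivalent named function: more ceremony for the same logic
def second(p):
    return p[1]

print(sorted(pairs, key=second))  # same result
```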
map(), filter(), reduce()
All three apply a function to the elements of a list; note that in Python 3, map() and filter() return lazy iterators (wrap them in list() if you need a list), while reduce() folds the list into a single value.
map()
Map applies a function to all the items in an input list. Here is the blueprint:
map(function_to_apply_must_return_value_with_elm_as_input, list_of_inputs)
filter()
As the name suggests, filter creates a list of elements for which a function returns true. Blueprint:
filter(filter_function_must_return_true_or_false_with_elm_as_input, list_of_inputs)
reduce()
Reduce is a really useful function for performing some computation on a list and returning the result. It applies a rolling computation to sequential pairs of values in a list, passing the accumulated result as the first argument and the next element as the second. In Python 3 it lives in functools (from functools import reduce). Blueprint:
reduce(fun_return_value_with_previous_result_and_elm_as_input, list_of_inputs)
def lambda_demo():
org: [1, 2, 3, 4, 5]
map(): [1, 4, 9, 16, 25]
org: [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4]
filter(): [-5, -4, -3, -2, -1]
24
10
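The original demo cell is truncated above; here is a sketch that reproduces outputs like those shown (the exact reduce operands are an assumption):

```python
from functools import reduce  # reduce moved to functools in Python 3

org = [1, 2, 3, 4, 5]
squares = list(map(lambda x: x * x, org))        # [1, 4, 9, 16, 25]

nums = list(range(-5, 5))
negatives = list(filter(lambda x: x < 0, nums))  # [-5, -4, -3, -2, -1]

# reduce folds the list: func(accumulated, element) -> new accumulated
product = reduce(lambda acc, x: acc * x, [1, 2, 3, 4])  # 24
total = reduce(lambda acc, x: acc + x, [1, 2, 3, 4])    # 10

print(squares, negatives, product, total)
```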
retry
Sometimes function needs to run again when it fails, that’s retry, you can use function from python retrying module, it provides a decorator that can do retrying, just add this decorator to your function that needs retrying.
1 | # pip3 install retrying |
you should see me twice: 0
you should see me twice: 1
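The retrying package supplies this decorator ready-made. To illustrate the idea, here is a minimal hand-rolled retry decorator; its name and policy are illustrative, not the retrying package's API:

```python
import functools

def retry(times):
    """Re-run the wrapped function up to `times` attempts
    (a simplified stand-in for the retrying package's @retry)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(times):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == times - 1:
                        raise  # out of attempts, re-raise
        return wrapper
    return decorator

calls = []

@retry(times=2)
def flaky():
    calls.append(len(calls))
    print('you should see me twice:', len(calls) - 1)
    raise ValueError('boom')

try:
    flaky()
except ValueError:
    pass  # the function ran twice before the exception escaped
```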
daemon
In some cases an app needs to run as a daemon. The python-daemon package provides this; to use it, first install it with pip3 install python-daemon.
main.py
# pip3 install python-daemon
Run it this way:
./main.py
Run shell command
Sometimes we need to run a shell command from Python; there are several APIs for this. They can be grouped into two types:
- APIs that return the shell output
- APIs that do not return the shell output
shell output returned as value
Old APIs
- getoutput()
- getstatusoutput()
parameter
these two methods are implicitly invoked in shell mode, which means the parameter is a single string like 'ls -al', not a list like ['ls', '-al']
result:
getoutput() returns the command output (or an error message such as '/bin/sh: 1: lh: not found') as a str.
getstatusoutput() returns a tuple (returncode, output); return code 0 means success.
New API
- check_output() -> waits until all output is read. If a child such as ssh inherits the pipe, check_output() will wait until it exits (until it closes its inherited pipe ends).
Return value: output as bytes (utf-8).
Raises an exception if the return code is non-zero; otherwise returns the output as bytes. It is like getoutput(), except that getoutput() does not raise an exception on failure.
no shell output returned but print it to console
Old API
- call()
New API
- check_call() -> returns as soon as the /bin/sh process exits, without waiting for descendant processes, so it is fast.
These two methods do not store the output; it is only printed inside the library (to the console or a file, depending on the current stdout). call() returns 0 on success; the check_* versions raise an exception on failure. Catch the exception (CalledProcessError) to see its returncode and output attributes.
check_call() does the same thing as call(), but check_call() throws an exception when the command fails to run, so you can see where the problem is. Always use check_call().
parameter
These APIs can run in two modes, shell enabled or disabled. With shell enabled the parameter is a str; otherwise it is a list, like this:
call(['ls', '-al'])
When shell=True, /bin/sh is used to run the command; note that /bin/sh does not support brace expansion ($ echo 'hi '{jason,josh}).
result of check_call() or check_output()
exception subprocess.CalledProcessError
Subclass of SubprocessError, raised when a process run by check_call() or check_output() returns a non-zero exit status.
returncode
Exit status of the child process. If the process exited due to a signal, this will be the negative signal number.
cmd
Command that was used to spawn the child process.
output
Output of the child process if it was captured by run() or check_output(). Otherwise, None.
stdout
Alias for output, for symmetry with stderr.
stderr
Stderr output of the child process if it was captured by run(). Otherwise, None.
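A short sketch of catching CalledProcessError and reading those attributes; the failing command is arbitrary:

```python
import subprocess

# check_call() prints output to the current stdout and returns 0 on
# success; on failure it raises CalledProcessError.
try:
    subprocess.check_call(['ls', '/no/such/dir'])
except subprocess.CalledProcessError as e:
    print('command failed, returncode =', e.returncode)
    print('cmd =', e.cmd)
    # e.output is None here: check_call() does not capture output
```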
import subprocess
getoutput('ls /bin/ls') shows:
/bin/ls
getstatusoutput('/bin/ls') shows:
(0, 'algorithm\nclass_advanced.ipynb\ndebug\nfunction_advanced.ipynb\nhello.txt\npython3_advanced.ipynb\npython3_basic.ipynb\npython3_lib.ipynb\nreadme_pip.ipynb\nsamples')
check_output('/bin/ls', shell=True) shows:
algorithm
class_advanced.ipynb
debug
function_advanced.ipynb
hello.txt
python3_advanced.ipynb
python3_basic.ipynb
python3_lib.ipynb
readme_pip.ipynb
samples
check_output() raises an exception when the command is not found; getoutput() does not raise, it returns the error text instead:
/bin/sh: /bin/not_found: No such file or directory
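Here is a compact, runnable comparison of the three output-returning APIs, using echo so the output is predictable:

```python
from subprocess import getoutput, getstatusoutput, check_output

# getoutput(): always a str, errors are folded into the output
print(getoutput('echo hello'))             # hello

# getstatusoutput(): a (returncode, output) tuple, 0 means success
status, out = getstatusoutput('echo hello')
print(status, out)                         # 0 hello

# check_output(): bytes, raises CalledProcessError on failure
print(check_output('echo hello', shell=True))  # b'hello\n'
```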
Deep copy and shallow copy of list or dict
A shallow copy constructs a new collection object and then populates it with references to the child objects found in the original. The copying process does not recurse, so it does not create copies of the child objects themselves. With a shallow copy, changes made to a shared child object through the copy are reflected in the original object.
This sharing only matters for mutable objects (like list, dict, class instances); immutable values such as integers behave as if they were copied.
Create shallow copy
# use copy library
Create deep copy
import copy
shallow copy
dt = {'a': [1, 2]}
before creating new dict, org dict: {'a': [1, 2]}
new dict {'a': [1, 2, 3]}
org dict {'a': [1, 2, 3]}
before creating new list, org list: [1, [2, 3]]
new list: [1, [5, 3]]
org list: [1, [5, 3]]
Deep copy
import copy
original dict: {'a': [1, 2]}
copied dict after deepcopy is: {'a': [1, 2]}
added an item in original dict: {'a': [1, 2, 3]}
copied dict now is: {'a': [1, 2]}
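The two behaviors side by side, as a minimal sketch:

```python
import copy

org = {'a': [1, 2]}

shallow = copy.copy(org)     # new dict, but 'a' points to the SAME list
deep = copy.deepcopy(org)    # new dict AND a new inner list

org['a'].append(3)
print(shallow['a'])  # [1, 2, 3] -- the shared child object changed too
print(deep['a'])     # [1, 2]    -- the deep copy is fully independent
```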
any/all function
all(iterable) returns True if every value in the iterable is truthy (bool(elm) == True),
while any(iterable) returns True if any value in the iterable is truthy.
- all(empty_iterable) == True
- any(empty_iterable) == False
One use case for all() is to check if an iterable object includes another.
def any_all_demo():
any([])= False
any([0])= False
any([1])= True
all(())= True
all((1,))= True
all([0])= False
[True, True]
list2 contains list1
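The containment check mentioned above can be written as a one-liner:

```python
list1 = [1, 2]
list2 = [1, 2, 3]

# containment check: every element of list1 must be in list2
contains = all(x in list2 for x in list1)
print(contains)  # True

# edge cases on empty iterables
print(any([]), all([]))  # False True
```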
CSV file
If the file type is csv, you can use import csv or import pandas to access it. Prefer pandas, as it is more powerful.
A csv file is a text file with a fixed format: each row should have the same columns, separated by a 'separator'.
import os
pandas read test.csv
a_name b_name
0 1 4
1 2 5
2 3 6
pandas read test.csv only values:
[[1 4]
[2 5]
[3 6]]
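The pandas output is shown above; for comparison, here is the same data read with only the stdlib csv module (the in-memory CSV text mirrors the assumed test.csv):

```python
import csv
import io

# a small CSV like the test.csv above, as an in-memory file
text = "a_name,b_name\n1,4\n2,5\n3,6\n"

rows = list(csv.DictReader(io.StringIO(text)))
print(rows[0])   # {'a_name': '1', 'b_name': '4'}

# note: the csv module keeps everything as str; pandas infers int
values = [[int(r['a_name']), int(r['b_name'])] for r in rows]
print(values)    # [[1, 4], [2, 5], [3, 6]]
```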
serialize class, tuple, list, dict
In some cases we want to save our data (objects) into a file in binary form (not JSON), as binary is small, then read it back into the objects we used before. Use cloudpickle, which extends pickle; it supports serialization of many objects: class, tuple, list, dict, etc.
def save_object_into_file():
write to file with data: ['a', 'b']
read data back: ['a', 'b']
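cloudpickle follows the stdlib pickle API (dump/load), so a sketch with plain pickle illustrates the same flow; the file path is illustrative:

```python
import os
import pickle
import tempfile

data = ['a', 'b']
path = os.path.join(tempfile.gettempdir(), 'demo.pkl')

with open(path, 'wb') as f:          # binary mode is required
    pickle.dump(data, f)
print('write to file with data:', data)

with open(path, 'rb') as f:
    restored = pickle.load(f)
print('read data back:', restored)   # ['a', 'b']

os.remove(path)
```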
show trace with try/except
Without try/except, the traceback is printed when an exception happens. But if we want to see the traceback even though we catch the exception, use the traceback module (or the newer traceback2).
def show_traceback_even_with_exception():
------------traceback is printed
Traceback (most recent call last):
File "/tmp/ipykernel_2840/2931145062.py", line 6, in show_traceback_even_with_exception
ZeroDivisionError: division by zero
------------traceback is printed
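A sketch of the pattern using the stdlib traceback module:

```python
import traceback

def show_traceback_even_with_exception():
    try:
        1 / 0
    except ZeroDivisionError:
        # the exception is handled, but we still want to see the trace
        print('-' * 12 + 'traceback is printed')
        traceback.print_exc()
        print('-' * 12 + 'traceback is printed')

show_traceback_even_with_exception()
```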
package and module
- A package is a directory with __init__.py under it
- a module is a Python file like request.py
The most common use case is to group related modules within a package, then use them; the structure may look like this:
`-- common
After import you may use xxx.yyy (yyy must be a symbol of xxx.py or of xxx/__init__.py).
import package_name
What is the content of __init__.py?
Ideally you can write anything in it; it is just a Python file that runs when the package is imported. But mostly we write it with one of the contents below.
Examples of __init__.py:
- empty, just required by Python
# __init__.py
# test.py shows how to use the module
- wrap the modules' API
# __init__.py
# test.py
- wrap the module API and define a package-level API as well
# __init__.py
# test.py
access module from parent directory
`-- app
Add module to sys.path
# file test_app.py
Use relative path
# file test_app.py
inside with statement
class-based support for with
When the with statement is executed, Python evaluates the expression, calls the __enter__ method on the resulting value (the "context manager"), and assigns whatever __enter__ returns to the variable given by as. Python then executes the code body and, no matter what happens in that code, calls the object's __exit__ method.
This interface of __enter__() and __exit__() methods, which provides with-statement support in user-defined objects, is called the context manager protocol.
class ctl:
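A minimal sketch of the protocol; the class and variable names are illustrative, and events are collected in a list so the order is visible:

```python
events = []

class Managed:
    """Minimal context manager implementing __enter__/__exit__."""
    def __enter__(self):
        events.append('enter')
        return self                 # value bound to the `as` variable

    def __exit__(self, exc_type, exc_value, tb):
        events.append('exit')       # runs even if the body raises
        return False                # False -> do not swallow exceptions

with Managed() as m:
    events.append('body')

print(events)  # ['enter', 'body', 'exit']
```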
function-based: contextlib module support for with
A class-based context manager as shown above (__enter__, __exit__) is not the only way to support the with statement in user-defined objects. The contextlib module provides a few more abstractions built upon the basic context manager interface.
from contextlib import contextmanager
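For example, the @contextmanager decorator turns a generator into a context manager: code before the yield plays the role of __enter__ and the finally block plays __exit__ (names here are illustrative):

```python
from contextlib import contextmanager

events = []

@contextmanager
def managed(name):
    events.append('enter')     # runs on entering `with`
    try:
        yield name             # value bound to the `as` variable
    finally:
        events.append('exit')  # runs even if the body raises

with managed('res') as r:
    events.append(r)

print(events)  # ['enter', 'res', 'exit']
```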
with for file and lock
Because it reduces programming effort, some built-in objects, like file objects and Lock/Condition from threading, support the context manager protocol. So for files and locks, always use the with statement.
lock/condition from threading
import threading
open file
# without using with statement
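Side by side, the with-statement form for a lock and for a file; paths here are illustrative:

```python
import os
import tempfile
import threading

lock = threading.Lock()

# `with` acquires the lock and guarantees it is released,
# even if the body raises
with lock:
    shared_value = 42

# same pattern for files: close() is guaranteed
path = os.path.join(tempfile.gettempdir(), 'with_demo.txt')
with open(path, 'w') as f:
    f.write('hello')
with open(path) as f:
    content = f.read()
print(content)  # hello
os.remove(path)
```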
schedule job
APScheduler has four kinds of components:
- triggers
- job stores
- executors
- schedulers
Triggers
contain the scheduling logic. Each job has its own trigger which determines when the job should be run next. Beyond their initial configuration, triggers are completely stateless.
Job stores
house the scheduled jobs. The default job store simply keeps the jobs in memory, but others store them in various kinds of databases. A job’s data is serialized when it is saved to a persistent job store, and deserialized when it’s loaded back from it. Job stores (other than the default one) don’t keep the job data in memory, but act as middlemen for saving, loading, updating and searching jobs in the backend. Job stores must never be shared between schedulers.
Executors
are what handle the running of the jobs. They do this typically by submitting the designated callable in a job to a thread or process pool. When the job is done, the executor notifies the scheduler which then emits an appropriate event.
Schedulers
are what bind the rest together. You typically have only one scheduler running in your application. The application developer doesn’t normally deal with the job stores, executors or triggers directly. Instead, the scheduler provides the proper interface to handle all those. Configuring the job stores and executors is done through the scheduler, as is adding, modifying and removing jobs.
choose trigger
When you schedule a job, you need to choose a trigger for it. The trigger determines the logic by which the dates/times are calculated when the job will be run. APScheduler comes with three built-in trigger types:
- date: use when you want to run the job just once at a certain point of time
- interval: use when you want to run the job at fixed intervals of time
- cron: use when you want to run the job periodically at certain time(s) of day
choose scheduler
Your choice of scheduler depends mostly on your programming environment and what you’ll be using APScheduler for. Here’s a quick guide for choosing a scheduler:
- BlockingScheduler: use when the scheduler is the only thing running in your process
- BackgroundScheduler: use when you're not using any of the frameworks below, and want the scheduler to run in the background inside your application
- AsyncIOScheduler: use if your application uses the asyncio module
- GeventScheduler: use if your application uses gevent
Usage
start scheduler
Starting the scheduler is done by simply calling start() on the scheduler. For schedulers other than BlockingScheduler, this call will return immediately
and you can continue the initialization process of your application, possibly adding jobs to the scheduler.
For BlockingScheduler, you will only want to call start() after you’re done with any initialization steps.
Add a job
There are two ways to add jobs to a scheduler
- by calling add_job()
- by decorating a function with scheduled_job()
# args = [name]
The first way is the most common way to do it. The second way is mostly a convenience to declare jobs that don't change during the application's run time. The add_job() method returns an apscheduler.job.Job instance that you can use to modify or remove the job later.
name = 'jack'
Remove a job
When you remove a job from the scheduler, it is removed from its associated job store and will not be executed anymore. There are two ways to make this happen:
- by calling remove_job() with the job’s ID and job store alias
- by calling remove() on the Job instance you got from add_job()
job = scheduler.add_job(myfunc, 'interval', minutes=2)
list all jobs
Getting a list of scheduled jobs
To get a machine processable list of the scheduled jobs, you can use the get_jobs() method. It will return a list of Job instances.
Cron job
Parameter for cron type
# Schedules job_function to be run on the third Friday
- year (int|str) – 4-digit year
- month (int|str) – month (1-12)
- day (int|str) – day of month (1-31)
- week (int|str) – ISO week (1-53)
- day_of_week (int|str) – number or name of weekday (0-6 or mon,tue,wed,thu,fri,sat,sun)
- hour (int|str) – hour (0-23)
- minute (int|str) – minute (0-59)
- second (int|str) – second (0-59)
- start_date (datetime|str) – earliest possible date/time to trigger on (inclusive)
- end_date (datetime|str) – latest possible date/time to trigger on (inclusive)
- timezone (datetime.tzinfo|str) – time zone to use for the date/time calculations (defaults to scheduler timezone)
- jitter (int|None) – delay the job execution by jitter seconds at most
document and schedule framework
use mongodb as default job store
By default, APScheduler uses memory as the job store, which means jobs are lost when the app restarts. For a persistent job store, it supports SQL databases, MongoDB, Redis, etc.
from apscheduler.jobstores.mongodb import MongoDBJobStore
#! /usr/bin/env python3
<Job (id=91ed6da1d3ba41e6b6169d329243687b name=job)>
types
Python has had support for optional "type hints" since Python 3.6.
These type hints are a special syntax for declaring the type of a variable, a function's parameters, its return value, etc.
By declaring types for your variables, editors and tools can give you better support.
from typing import List, Set, Dict, Tuple, Optional
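A small sketch of annotated functions; the function names are illustrative, and note the hints are not enforced at runtime:

```python
from typing import Dict, List, Optional

def summarize(scores: Dict[str, int], top: int = 1) -> List[str]:
    """Return the names of the `top` highest scorers."""
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    return ranked[:top]

def display_name(name: Optional[str]) -> str:
    # Optional[str] means "str or None"
    return name if name is not None else 'anonymous'

print(summarize({'a': 1, 'b': 3}))  # ['b']
print(display_name(None))           # anonymous
```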
CLI Framework
For a good application, it is better to provide a good CLI, so users can learn how to use it from the help output. Python provides a good built-in parser, argparse, but there are other frameworks that improve on it: python-fire, developed by Google, is the best one to go with, while click is another good choice that is easy to understand and auto-generates help.
python-fire
Python Fire is a library for automatically generating command-line interfaces from absolutely any Python object. It can help debug Python code more easily from the command line, create CLI interfaces to existing code, allow you to interactively explore code in a REPL, and simplify transitioning between Python and Bash (or any other shell).
You can call Fire on any Python object: functions, classes, modules, objects, dictionaries, lists, tuples, etc. They all work!
click (use me)
Recommended Way
Click is a Python package for creating beautiful command line interfaces in a composable way with as little code as necessary. It’s the “Command Line Interface Creation Kit”. It’s highly configurable but comes with sensible defaults out of the box.
It aims to make the process of writing command line tools quick and fun while also preventing any frustration caused by the inability to implement an intended CLI API.
Click in three points:
- Arbitrary nesting of commands
- Automatic help page generation
- Supports lazy loading of subcommands at runtime
- click options
Only one command
import click
$ ./cli.py --help
Multiple commands
import click
$ ./t.py
argparse
Three steps to use this library:
- initialize the parser with argparse.ArgumentParser
- add arguments: a keyword argument with parser.add_argument('--port'), or a positional argument with parser.add_argument('port')
- get the parsed args with parser.parse_args(), then do more
Here is a basic example.
app.py
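A minimal sketch of the three steps; the argument names are illustrative, and a list is passed to parse_args() so the demo does not depend on sys.argv:

```python
import argparse

# step 1: initialize the parser
parser = argparse.ArgumentParser(description='demo app')

# step 2: add a keyword argument and a positional argument
parser.add_argument('--port', type=int, default=80)
parser.add_argument('host')

# step 3: parse (normally parse_args() reads sys.argv)
args = parser.parse_args(['example.com', '--port', '8080'])
print(args.host, args.port)   # example.com 8080
```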
Some tips
zip()
creates pairs from two sequences.
def step_two_sequences_pars():
The capital of Japan is Tokyo
The capital of Korea is Seoul
The capital of China is Beijing
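A runnable version of the idea; the country data mirrors the output above:

```python
countries = ['Japan', 'Korea', 'China']
capitals = ['Tokyo', 'Seoul', 'Beijing']

# zip() pairs up elements from both sequences
for country, capital in zip(countries, capitals):
    print(f'The capital of {country} is {capital}')

# zip stops at the shorter sequence
pairs = list(zip([1, 2, 3], ['a', 'b']))
print(pairs)  # [(1, 'a'), (2, 'b')]
```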
better code snippets
# create a new list based on condition from a list
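For instance (the data is illustrative):

```python
# create a new list based on a condition from an existing list
nums = [1, 2, 3, 4, 5, 6]
evens = [n * n for n in nums if n % 2 == 0]
print(evens)    # [4, 16, 36]

# a dict comprehension works the same way
squares = {n: n * n for n in nums if n > 4}
print(squares)  # {5: 25, 6: 36}
```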
two prints on the same line
import time
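The trick is print's end and flush parameters:

```python
import time

# end='' suppresses the trailing newline; flush=True pushes the text
# out immediately instead of waiting for the buffer to fill
print('working', end='', flush=True)
time.sleep(0.1)
print(' ... done')   # continues on the same line: "working ... done"
```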