python-advanced

Advanced features

lambda

A lambda is like a function (with an implicit return) containing one statement, but with no name (an anonymous function). If you have more logic, use a regular function instead. Creating a lambda creates a function object, which you can then call later on.


lambda features

  • implicit return
  • one statement
  • anonymous (no name)

f = lambda arg1, arg2: expression (the result of the expression is the return value)

[expression for x in list if condition]
# save the result of expression into a list if the condition matches
# the expression can be simple or a lambda call!!!
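
A quick sketch of both forms (the names here are illustrative):

evens = [x * 2 for x in range(10) if x % 2 == 0]   # simple expression
print(evens)  # [0, 4, 8, 12, 16]

cube = lambda v: v ** 3
print([cube(x) for x in range(4) if x > 0])        # lambda applied per element: [1, 8, 27]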

The most common use case of lambda looks like this:

  • some API parameter needs a function
  • the function is small

In that case, use a lambda instead of a named function, as the sketch below shows.
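
sorted() is a typical example: its key parameter takes a small function (a minimal sketch, the data is made up):

people = [('jason', 30), ('josh', 25)]
# sort by age, the second element of each tuple
print(sorted(people, key=lambda p: p[1]))  # [('josh', 25), ('jason', 30)]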

map() and filter() apply a function to each element and return an iterator (wrap it in list() to get a list); reduce() returns a single accumulated value.

map()

Map applies a function to all the items in an input_list. Here is the blueprint:

map(function_to_apply_must_return_value_with_elm_as_input, list_of_inputs)

filter()

As the name suggests, filter creates a list of elements for which a function returns true.

filter(filter_function_must_return_true_or_false_with_elm_as_input, list_of_inputs)

reduce()

Reduce is a really useful function for performing some computation on a list and returning a single result. It applies a rolling computation to sequential pairs of values in a list; it passes the accumulated result of the previous call as the first parameter and the next element as the second!

reduce(fun_with_previous_result_and_elm_as_input, list_of_inputs)

def lambda_demo():
    from functools import reduce

    # a function created by lambda is said to be anonymous because it was never given a name
    f = lambda x: x**3
    print(f(3))

    items = [1, 2, 3, 4, 5]
    print("org:", items)
    squared = list(map(lambda x: x**2, items))
    print("map():", squared)

    number_list = range(-5, 5)
    print("org:", list(number_list))
    less_than_zero = list(filter(lambda x: x < 0, number_list))
    print("filter():", less_than_zero)

    list1 = [1, 2, 3, 4]
    # x is the accumulated result of the previous call, y is the next element of list1
    print(reduce(lambda x, y: x * y, list1))
    print(reduce(lambda x, y: x + y, list1))

lambda_demo()
27
org: [1, 2, 3, 4, 5]
map(): [1, 4, 9, 16, 25]
org: [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4]
filter(): [-5, -4, -3, -2, -1]
24
10

retry

Sometimes a function needs to run again when it fails; that's a retry. You can use the retrying module, which provides a decorator that does the retrying: just add the decorator to any function that needs it.

# pip3 install retrying
def retry_demo():
    from retrying import retry

    # by default, a retry happens when an exception is raised, but this can be customized
    @retry(stop_max_attempt_number=2, wait_random_min=1000, wait_random_max=2000)
    def run():
        print('you should see me twice')
        raise NameError("hi")

    def retry_if_result_not_2(result):
        return result != 2

    # retry by checking the result of run2
    result = 0

    @retry(retry_on_result=retry_if_result_not_2, wait_random_min=1000, wait_random_max=2000)
    def run2():
        nonlocal result
        print('you should see me twice:', result)
        if result != 2:
            result = result + 1
        return result

    run2()

retry_demo()
you should see me twice: 0
you should see me twice: 1

daemon

In some cases an app needs to run as a daemon; python-daemon provides that capability. To use it, first install it with pip3 install python-daemon.

main.py

#! /usr/bin/env python3
# pip3 install python-daemon
import time

import daemon


class App:
    def __init__(self):
        pass

    def start(self):
        with daemon.DaemonContext():
            # forks a new process
            self.run()

    def run(self):
        time.sleep(10000)
        # print does not go to the console as it's a daemon!!!
        # print("I run as a daemon")


app = App()
app.start()

Run it this way

$ ./main.py
$ ps -ef | grep main.py

Run shell command

Sometimes we need to run a shell command from Python; there are several APIs you can use to do this.

These APIs can be grouped into two types:

  • one type returns the shell output,
  • the other does not return the shell output.

shell output returned as value

Old APIs

  • getoutput()
  • getstatusoutput()

parameter

these two methods are implicitly invoked in shell mode! That means the parameter is 'ls -al', not ['ls', '-al']

result:

getoutput() returns a str of the command output, or an error message like '/bin/sh: 1: lh: not found'

getstatusoutput() returns a tuple (returncode, output); return code 0 means success!

New API

  • check_output()–>waits until all output is read. If ssh inherits the pipe then check_output() will wait until it exits (until it closes its inherited pipe ends).

return value: output in bytes (utf-8)

It raises an exception if the return code is non-zero; otherwise it returns the output as bytes. It's similar to getoutput(), but getoutput() doesn't raise an exception on failure.

no shell output returned but print it to console

Old API

  • call()

New API

  • check_call()–>returns as soon as /bin/sh process exits without waiting for descendant processes, fast

These two methods do not store the output; they only print it inside the library (maybe to the console or a file, depending on the current STD)!!! They only return code 0 on success; the check_xx version raises an exception otherwise. Catch the exception (CalledProcessError) to see its returncode and output attributes.

check_call() does the same thing as call(), but check_call() throws an exception when the command fails to run, so that you can see where the problem is!!! So always use check_call().

parameter

These APIs can run in two modes, shell enabled or disabled. With shell enabled, the parameter is a str; otherwise it's a list [], like this.

When shell=True, it uses /bin/sh to run the command; /bin/sh does not support brace expansion ($ echo 'hi '{jason,josh}).

from subprocess import call, check_call

call(['ls', '-al'])
check_call('ls -al', shell=True)

check_call(cmd, shell=True, executable='/bin/bash')  # use bash
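
A small sketch of that brace-expansion difference (assuming /bin/sh on your system is not bash):

from subprocess import check_call

# /bin/sh does not expand braces: prints "hi {jason,josh}"
check_call("echo 'hi '{jason,josh}", shell=True)

# bash expands them: prints "hi jason hi josh"
check_call("echo 'hi '{jason,josh}", shell=True, executable='/bin/bash')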

result of check_call() or check_output()

exception subprocess.CalledProcessError

Subclass of SubprocessError, raised when a process run by check_call() or check_output() returns a non-zero exit status.

returncode

Exit status of the child process. If the process exited due to a signal, this will be the negative signal number.

cmd

Command that was used to spawn the child process.

output

Output of the child process if it was captured by run() or check_output(). Otherwise, None.

stdout

Alias for output, for symmetry with stderr.

stderr

Stderr output of the child process if it was captured by run(). Otherwise, None.
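
A minimal sketch of inspecting these attributes (the failing command is illustrative):

import subprocess

try:
    subprocess.check_output('exit 3', shell=True)
except subprocess.CalledProcessError as e:
    print('cmd:', e.cmd)                # exit 3
    print('returncode:', e.returncode)  # 3
    print('output:', e.output)          # b'' (stdout was captured but is empty)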

import subprocess

print("getoutput('ls /bin/ls') show below")
print(subprocess.getoutput('ls /bin/ls'))

print("getstatus('/bin/ls') show below")
print(subprocess.getstatusoutput('/bin/ls'))

print("check_output('/bin/ls', shell=True) show below")
# need to decode the return value of check_output()
print(subprocess.check_output('/bin/ls', shell=True).decode('utf8'))

try:
    subprocess.check_output('/bin/not_found', shell=True)
except subprocess.CalledProcessError:
    print('exception: check_output() not found command')

print('no exception getoutput()', subprocess.getoutput('/bin/not_found'))  # no exception
getoutput('ls /bin/ls') show below
/bin/ls
getstatus('/bin/ls') show below
(0, 'algorithm\nclass_advanced.ipynb\ndebug\nfunction_advanced.ipynb\nhello.txt\npython3_advanced.ipynb\npython3_basic.ipynb\npython3_lib.ipynb\nreadme_pip.ipynb\nsamples')
check_output('/bin/ls', shell=True) show below
algorithm
class_advanced.ipynb
debug
function_advanced.ipynb
hello.txt
python3_advanced.ipynb
python3_basic.ipynb
python3_lib.ipynb
readme_pip.ipynb
samples

exception: check_output() not found command
no exception getoutput() /bin/sh: /bin/not_found: No such file or directory


/bin/sh: /bin/not_found: No such file or directory

Deep copy and shallow copy of list or dict

A shallow copy means constructing a new collection object and then populating it with references to the child objects found in the original. The copying process does not recurse and therefore will not create copies of the child objects themselves. With a shallow copy, the child objects are shared by reference, which means changes made to a mutable child through the copy are reflected in the original object.

Reference sharing only matters for mutable objects (like list, dict, class instances), not for immutable values like integers.

Create shallow copy

# use the copy library
import copy

lt = [1, 2]
dt = {'a': 1}

# shallow copy, no recursion into embedded objects!!!
newlt = copy.copy(lt)
newdt = copy.copy(dt)

# shallow copy (other ways)
newlt = lt[:]
newdt = dt.copy()

Create deep copy

import copy

# recurses into embedded objects!!!
newlt = copy.deepcopy(lt)
newdt = copy.deepcopy(dt)

shallow copy

dt = {'a': [1, 2]}
print('before creating new dict, org dict: ', dt)

# shallow copy
ndt = dt.copy()

ndt['a'].append(3)
print('new dict ', ndt)
print('org dict', dt)

lt = [1, [2, 3]]
print('before creating new list, org list: ', lt)

# shallow copy
nlt = lt[:]

nlt[1][0] = 5
print('new list: ', nlt)
print('org list: ', lt)
before creating new dict, org dict:  {'a': [1, 2]}
new dict  {'a': [1, 2, 3]}
org dict {'a': [1, 2, 3]}
before creating new list, org list:  [1, [2, 3]]
new list:  [1, [5, 3]]
org list:  [1, [5, 3]]

Deep copy

import copy

dt = {'a': [1, 2]}
dtcc = copy.deepcopy(dt)  # deep copy
print("original dict: ", dt)
print("copied dict after deepcopy is: ", dtcc)
dt['a'].append(3)  # dtcc doesn't change!!!!
print("added an item in original dict: ", dt)
print("copied dict now is: ", dtcc)
original dict:  {'a': [1, 2]}
copied dict after deepcopy is:  {'a': [1, 2]}
added an item in original dict:  {'a': [1, 2, 3]}
copied dict now is:  {'a': [1, 2]}

any/all function

all(iterable object) returns True if every value in the iterable satisfies bool(elm) == True,
while any(iterable object) returns True if any value in the iterable satisfies bool(elm) == True.

  • all(empty_iterable) == True
  • any(empty_iterable) == False

One use case for all() is to check if an iterable object includes another.

def any_all_demo():
    print("any([])=", any([]))
    print("any([0])=", any([0]))
    print("any([1])=", any([1]))

    print("all(())=", all(()))
    print("all((1,))=", all((1,)))
    print("all([0])=", all([0]))

any_all_demo()


def check_list_contains_another():
    list1 = [1, 2]
    list2 = [1, 2, 3]
    # for..in to loop
    # in to test membership
    result = [elm in list2 for elm in list1]
    print(result)
    if all(result):
        print('list2 contains list1')
    else:
        print('list2 does not contain list1')

check_list_contains_another()
any([])= False
any([0])= False
any([1])= True
all(())= True
all((1,))= True
all([0])= False
[True, True]
list2 contains list1

CSV file

If the file type is csv, you can use import csv or import pandas to access it.

Always use pandas, as it's more powerful!

A csv file is a text file with a fixed format: each row should have the same columns, separated by a 'separator'.
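
For comparison, a minimal sketch with the builtin csv module (the file name is illustrative); note that every field is read back as str:

import csv

with open('test.csv', 'w', newline='') as fd:
    writer = csv.writer(fd, delimiter=',')
    writer.writerow(['a_name', 'b_name'])  # header row
    writer.writerow([1, 4])

with open('test.csv', newline='') as fd:
    for row in csv.reader(fd):
        print(row)  # ['a_name', 'b_name'], then ['1', '4']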

import os
import pandas as pd

def pandas_csv():
    # arbitrary lists of values
    a = [1, 2, 3]
    b = [4, 5, 6]

    # the dict keys become the csv column names
    dataframe = pd.DataFrame({'a_name': a, 'b_name': b})

    # save the DataFrame as csv; index controls whether the row index is written, default=True
    # dataframe.to_csv("test.csv", index=False, sep=',', mode='a')  # a --> append, w --> write
    dataframe.to_csv("test.csv", index=False, sep=',', mode='w')
    '''
    a_name,b_name
    1,4
    2,5
    3,6
    '''
    data = pd.read_csv('test.csv')

    print('pandas read test.csv\n', data)
    print('pandas read test.csv only values:\n', data.values)
    os.remove('test.csv')

pandas_csv()
pandas read test.csv
    a_name  b_name
0       1       4
1       2       5
2       3       6
pandas read test.csv only values:
 [[1 4]
 [2 5]
 [3 6]]

serializing class, tuple, list, dict

In some cases we want to save our data (objects) into a file in binary form (not JSON), as it's smaller, and then read it back into the objects we used before. Use cloudpickle, which extends pickle; it supports serialization of many objects: classes, tuples, lists, dicts, etc.

def save_object_into_file():
    import os
    import cloudpickle

    lt = ['a', 'b']

    # write data to the file in binary mode
    with open('data_binary', 'wb') as fd:
        cloudpickle.dump(lt, fd)  # cloudpickle.dumps(lt) returns bytes instead of writing to a file
    print("write to file with data: ", lt)

    # read it back; must open in binary mode
    with open('data_binary', 'rb') as fd:
        lt = cloudpickle.load(fd)  # cloudpickle.loads(bytes) loads from bytes
    print('read data back: ', lt)

    os.remove('data_binary')

save_object_into_file()
write to file with data:  ['a', 'b']
read data back:  ['a', 'b']

show trace with try/except

Without try/except, the traceback is printed when an exception happens; but if we want to see the traceback even though we catch the exception, use traceback (or the newer traceback2).

def show_traceback_even_with_exception():
    import sys
    import traceback2

    try:
        a = 12 / 0
    except Exception as error:
        print('------------traceback is printed')
        traceback2.print_exc(file=sys.stdout)
        print('------------traceback is printed')

show_traceback_even_with_exception()
------------traceback is printed
Traceback (most recent call last):
  File "/tmp/ipykernel_2840/2931145062.py", line 6, in show_traceback_even_with_exception
ZeroDivisionError: division by zero
------------traceback is printed

package and module

  • A package is a dir with __init__.py under it
  • a module is a python file, like request.py

The most common use case is to group related modules within a package and then use them; the structure may look like this.

`-- common
|-- __init__.py
`-- request.py

After import you may use xxx.yyy (yyy must be a symbol in xxx.py or a symbol in xxx/__init__.py)

import package_name
# then use this way: package_name.module.function

from package_name import module_name
# then use this way: module_name.function

import module_name
from module_name import function_xx
from module_name import var_xx
# then use this way: function_xx

what content goes in __init__.py?

Ideally, you can write anything in it; it's just a python file that is executed when the package is imported, but mostly we write it with the content below.

Examples of __init__.py

  • empty, just required by python
#__init__.py
# test.py shows how to use the module
# must use the module with a package prefix or module prefix!!!
from common import request
request.aaa
  • wrap the modules' API
# __init__.py
# note: in Python 3 the import inside a package must be relative
from .request import *    # import all from the request module
from .request import aaa  # OR only import aaa for the package
# test.py
# then you can access the module's functions via the package name, like below
import common
# as common/__init__.py imports aaa from request,
# here we just use common.aaa, not request.aaa; common acts as a wrapper for its module!!!
common.aaa()
  • wrap the module API and define package-level API as well
# __init__.py
from .request import *    # import all for the package
from .request import aaa  # OR only import aaa for the package

# package-level API; it may call module APIs to wrap them
def pkg_aaa():
    print("pkg_aaa")
# test.py
import common

common.aaa()
common.pkg_aaa()

access module from parent directory

`-- app
|-- app.py
|-- common
| |-- __init__.py
| `-- util.py
`-- tests
`-- test_app.py

Add module to sys.path

# file test_app.py
import os
import sys

currentdir = os.path.dirname(os.path.realpath(__file__))
parentdir = os.path.dirname(currentdir)
sys.path.append(parentdir)

import app
from common import util

Use relative path

# file test_app.py
# note: 'import ..app' is not valid syntax; relative imports must use the
# 'from .. import x' form, and they only work when test_app.py is run as
# part of a package, e.g. python3 -m tests.test_app

from .. import app
from ..common import util

inside with statement

class-based support with()

When the with statement is executed, Python evaluates the expression, calls the __enter__ method on the resulting value (which is called a "context manager") and assigns whatever __enter__ returns to the variable given by as. Python then executes the code body and, no matter what happens in that code, calls the __exit__ method of that object.

This interface of __enter__() and __exit__() methods, which provides with statement support in user-defined objects, is called a Context Manager.

class ctl:
    def __enter__(self):
        # set things up
        return self

    def __exit__(self, type, value, traceback):
        # tear things down
        return True  # True means execution continues even if an exception happened!!!!
        # most of the time, False is used

    def divide_zero(self):
        bar = 1/0  # the exception will be passed to __exit__
        # if there is no __exit__, the program quits; otherwise, __exit__ returning
        # False re-raises the exception, True --> continue to run


def with_as_example():
    # like: th = ctl().__enter__()
    with ctl() as th:
        th.divide_zero()
    print('go here')

function-based: contextlib module to support with()

A class-based context manager as shown above (__enter__, __exit__) is not the only way to support the with statement in user-defined objects. The contextlib module provides a few more abstractions built upon the basic context manager interface.

from contextlib import contextmanager

class MessageWriter(object):
    def __init__(self, filename):
        self.file_name = filename

    @contextmanager
    def open_file(self):
        # open outside the try so 'file' is always bound in finally
        file = open(self.file_name, 'w')
        try:
            yield file
        finally:
            file.close()

# usage
message_writer = MessageWriter('hello.txt')
with message_writer.open_file() as my_file:
    my_file.write('hello world')

with for file and lock

Because it reduces programming effort, some built-in objects support the context manager protocol, such as files and Lock/Condition from threading; so for files and locks, always use the with statement.

lock/condition from threading

import logging
import threading

lock = threading.Lock()

# without with
lock.acquire()
try:
    logging.debug('Lock acquired directly')
finally:
    lock.release()

# with with
with lock:
    logging.debug('Lock acquired via with')

open file

# without using the with statement
file = open('file_path', 'w')
try:
    file.write('hello world')
finally:
    file.close()

# using the with statement
with open('file_path', 'w') as file:
    file.write('hello world')

schedule job

APScheduler has four kinds of components:

  • triggers
  • job stores
  • executors
  • schedulers

Triggers contain the scheduling logic. Each job has its own trigger which determines when the job should be run next. Beyond their initial configuration, triggers are completely stateless.

Job stores house the scheduled jobs. The default job store simply keeps the jobs in memory, but others store them in various kinds of databases. A job’s data is serialized when it is saved to a persistent job store, and deserialized when it’s loaded back from it. Job stores (other than the default one) don’t keep the job data in memory, but act as middlemen for saving, loading, updating and searching jobs in the backend. Job stores must never be shared between schedulers.

Executors are what handle the running of the jobs. They do this typically by submitting the designated callable in a job to a thread or process pool. When the job is done, the executor notifies the scheduler which then emits an appropriate event.

Schedulers are what bind the rest together. You typically have only one scheduler running in your application. The application developer doesn’t normally deal with the job stores, executors or triggers directly. Instead, the scheduler provides the proper interface to handle all those. Configuring the job stores and executors is done through the scheduler, as is adding, modifying and removing jobs.
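
A sketch of wiring these components together through the scheduler (the pool size and job defaults here are illustrative):

from apscheduler.executors.pool import ThreadPoolExecutor
from apscheduler.jobstores.memory import MemoryJobStore
from apscheduler.schedulers.background import BackgroundScheduler

# job stores, executors and job defaults are all configured through the scheduler
scheduler = BackgroundScheduler(
    jobstores={'default': MemoryJobStore()},
    executors={'default': ThreadPoolExecutor(10)},
    job_defaults={'coalesce': False, 'max_instances': 3})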


choose trigger
When you schedule a job, you need to choose a trigger for it. The trigger determines the logic by which the dates/times are calculated when the job will be run. APScheduler comes with three built-in trigger types:

  • date: use when you want to run the job just once at a certain point of time
  • interval: use when you want to run the job at fixed intervals of time
  • cron: use when you want to run the job periodically at certain time(s) of day

choose scheduler
Your choice of scheduler depends mostly on your programming environment and what you’ll be using APScheduler for. Here’s a quick guide for choosing a scheduler:

  • BlockingScheduler: use when the scheduler is the only thing running in your process
  • BackgroundScheduler: use when you’re not using any of the frameworks below, and want the scheduler to run in the background inside your application
  • AsyncIOScheduler: use if your application uses the asyncio module
  • GeventScheduler: use if your application uses gevent

Usage

start scheduler
Starting the scheduler is done by simply calling start() on the scheduler. For schedulers other than BlockingScheduler, this call will return immediately and you can continue the initialization process of your application, possibly adding jobs to the scheduler.

For BlockingScheduler, you will only want to call start() after you’re done with any initialization steps.
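
A minimal BackgroundScheduler sketch (the interval and sleep are illustrative): start() returns immediately and the job runs in a background thread:

import time
from apscheduler.schedulers.background import BackgroundScheduler

def tick():
    print('tick')

scheduler = BackgroundScheduler()
scheduler.add_job(tick, 'interval', seconds=1)
scheduler.start()   # returns immediately

time.sleep(5)       # the application keeps doing its own work here
scheduler.shutdown()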

Add a job
There are two ways to add jobs to a scheduler

  • by calling add_job()
  • by decorating a function with scheduled_job()
# args = [name]
# kwargs = {'n': name}
def add_job(self, func, trigger=None, args=None, kwargs=None, id=None, name=None,
            misfire_grace_time=undefined, coalesce=undefined, max_instances=undefined,
            next_run_time=undefined, jobstore='default', executor='default',
            replace_existing=False, **trigger_args):

The first way is the most common way to do it. The second way is mostly a convenience to declare jobs that don't change during the application's run time. The add_job() method returns an apscheduler.job.Job instance that you can use to modify or remove the job later.
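
A sketch of the decorator form (the job id and interval are illustrative):

from apscheduler.schedulers.blocking import BlockingScheduler

scheduler = BlockingScheduler()

# declared with scheduled_job() instead of add_job()
@scheduler.scheduled_job('interval', id='heartbeat', seconds=10)
def heartbeat():
    print('still alive')

scheduler.start()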

name = 'jack'

# add a job with args: positional args, na=name
def job(na):
    print(na)

scheduler.add_job(job, trigger='interval', args=[name], seconds=3)

# here the parameter must be named n!!!
def job(n):
    print(n)

# named parameters
scheduler.add_job(job, trigger='interval', kwargs={'n': name}, seconds=3)

Remove a job
When you remove a job from the scheduler, it is removed from its associated job store and will not be executed anymore. There are two ways to make this happen:

  • by calling remove_job() with the job’s ID and job store alias
  • by calling remove() on the Job instance you got from add_job()
job = scheduler.add_job(myfunc, 'interval', minutes=2)
job.remove()

scheduler.add_job(myfunc, 'interval', minutes=2, id='my_job_id')
scheduler.remove_job('my_job_id')

list all jobs
Getting a list of scheduled jobs

To get a machine-processable list of the scheduled jobs, you can use the get_jobs() method. It will return a list of Job instances, as sketched below.
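
A quick sketch, assuming a scheduler with jobs already added:

for job in scheduler.get_jobs():
    # each entry is an apscheduler.job.Job instance
    print(job.id, job.name, job.next_run_time)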

Cron job

Parameter for cron type

# Schedules job_function to be run on the third Friday
# of June, July, August, November and December at 00:00, 01:00, 02:00 and 03:00
sched.add_job(job_function, 'cron', month='6-8,11-12', day='3rd fri', hour='0-3')
  • year (int|str) – 4-digit year
  • month (int|str) – month (1-12)
  • day (int|str) – day of month (1-31)
  • week (int|str) – ISO week (1-53)
  • day_of_week (int|str) – number or name of weekday (0-6 or mon,tue,wed,thu,fri,sat,sun)
  • hour (int|str) – hour (0-23)
  • minute (int|str) – minute (0-59)
  • second (int|str) – second (0-59)
  • start_date (datetime|str) – earliest possible date/time to trigger on (inclusive)
  • end_date (datetime|str) – latest possible date/time to trigger on (inclusive)
  • timezone (datetime.tzinfo|str) – time zone to use for the date/time calculations (defaults to scheduler timezone)
  • jitter (int|None) – delay the job execution by jitter seconds at most


use mongodb as default job store

By default, apscheduler uses memory as the job store, which means jobs are lost when it restarts. For a persistent job store, it supports SQL databases, mongodb, redis, etc.

from apscheduler.jobstores.mongodb import MongoDBJobStore                       

jobstores = { 'default': MongoDBJobStore(database='apscheduler', collection='jobs', host='localhost', port=27017)}

scheduler = BackgroundScheduler(timezone='Asia/Shanghai', jobstores=jobstores)
#! /usr/bin/env python3
from apscheduler.schedulers.blocking import BlockingScheduler
import time

scheduler = BlockingScheduler(timezone='Asia/Shanghai')


def job():
    print("%s: start job" % time.asctime())


# run every 1 hour, 1 minute and 3 seconds
scheduler.add_job(job, 'interval', hours=1, minutes=1, seconds=3)

# run once at a given time
# scheduler.add_job(job, trigger='date', run_date='2022-01-04 09:04:01')

# cron job
# scheduler.add_job(job, trigger='cron', minute='*/5')

# start to run as a daemon
# scheduler.start()
<Job (id=91ed6da1d3ba41e6b6169d329243687b name=job)>

types

Python has had support for optional “type hints” since Python 3.6.

These “type hints” are a special syntax that allows declaring the type of a variable, function parameters, return values, etc.

By declaring types for your variables, editors and tools can give you better support.

from typing import List, Set, Dict, Tuple, Optional, Union

# For simple built-in types, just use the name of the type
x: int = 1
x: float = 1.0
x: bool = True
x: str = "test"
x: bytes = b"test"

# For collections, the type of the collection item is in brackets
# (Python 3.9+)
x: list[int] = [1]
x: set[int] = {6, 7}

# In Python 3.8 and earlier, the name of the collection type is
# capitalized, and the type is imported from the 'typing' module
x: List[int] = [1]
x: Set[int] = {6, 7}

# This is how you annotate a function definition
def stringify(num: int) -> str:
    return str(num)

# Use Union when something could be one of a few types
x: list[Union[int, str]] = [3, 5, "test", "fun"]
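
Optional is imported above but not shown yet; a small sketch (the function is made up), where Optional[X] is equivalent to Union[X, None]:

def find_role(name: str) -> Optional[str]:
    roles = {'jack': 'admin'}
    return roles.get(name)  # None when the name is missing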

CLI Framework

For a good application, it's better to provide a good CLI so users can learn it from the help output. Python provides a good built-in parser, argparse, but there are other frameworks that improve on it: python-fire, developed by Google, is the best one to go with, while click is another good choice that is easy to understand and auto-generates help!!

python-fire

Python Fire is a library for automatically generating command-line interfaces from absolutely any Python object. It can help debug Python code more easily from the command line, create CLI interfaces to existing code, allow you to interactively explore code in a REPL, and simplify transitioning between Python and Bash (or any other shell).

You can call Fire on any Python object: functions, classes, modules, objects, dictionaries, lists, tuples, etc. They all work!
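
A minimal fire sketch (pip3 install fire; the greeting function is illustrative):

#! /usr/bin/env python3
# pip3 install fire
import fire


def hello(name='World'):
    return 'Hello %s!' % name


if __name__ == '__main__':
    fire.Fire(hello)  # run as: ./hello.py --name=jack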

click (use me)

Recommended Way

Click is a Python package for creating beautiful command line interfaces in a composable way with as little code as necessary. It’s the “Command Line Interface Creation Kit”. It’s highly configurable but comes with sensible defaults out of the box.

It aims to make the process of writing command line tools quick and fun while also preventing any frustration caused by the inability to implement an intended CLI API.

Click in three points:

  • Arbitrary nesting of commands
  • Automatic help page generation
  • Supports lazy loading of subcommands at runtime

Only one command

import click


@click.command()
@click.option("--count", default=1, help="Number of greetings.")
@click.option("-name",
              "--name",
              prompt="Your name",
              help="The person to greet.")
def cli(count, name):
    # handler of the command
    """Simple program that greets NAME for a total of COUNT times."""
    for _ in range(count):
        click.echo(f"Hello, {name}!")


if __name__ == '__main__':
    cli()
$ ./cli.py --help
Usage: cli.py [OPTIONS]

  Simple program that greets NAME for a total of COUNT times.

Options:
  --count INTEGER     Number of greetings.
  -name, --name TEXT  The person to greet.
  --help              Show this message and exit.

Multiple commands

import click


@click.group()
def cli():
    pass


# NOTE: the decorator here is cli, the group function name, not click.command!!!
# so the command is under the cli group!!
@cli.command()
def initdb():
    click.echo('Initialized the database')


@cli.command()
def dropdb():
    click.echo('Dropped the database')


if __name__ == '__main__':
    cli()
$ ./t.py
Usage: t.py [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  dropdb
  initdb

argparse

Three steps to use this library

  • initialize the parser with argparse.ArgumentParser
  • add arguments: a named argument parser.add_argument('--port') or a positional argument parser.add_argument('port')
  • get the parsed args with parser.parse_args(), then do more

Here is a basic example for it.

app.py


# parameters for add_argument

# name or flags - Either a name or a list of option strings, e.g. foo or -f, --foo.
# action - The basic type of action to be taken when this argument is encountered at the command line.
# nargs - The number of command-line arguments that should be consumed.
# const - A constant value required by some action and nargs selections.
# default - The value produced if the argument is absent from the command line.
# type - The type to which the command-line argument should be converted.
#        if no type is specified, str is the default!!!
# choices - A container of the allowable values for the argument.
# required - Whether or not the command-line option may be omitted (optionals only).
# help - A brief description of what the argument does.
# metavar - A name for the argument in usage messages.
# dest - The name of the attribute to be added to the object returned by parse_args().
#        [if not provided, the name is used as the keyword to store]


import sys
import argparse

cli_commands = {
    'get': {
        'usage': ["get <type>"]
    },
    'put': {
        'usage': ["put <type> value"]
    }
}


def construct_help_commands():
    help_commands = []
    for command in cli_commands.keys():
        help_commands.extend(
            ["- " + x for x in cli_commands[command]['usage']])
    # this creates a command list with '- ' prepended,
    # like ['- get <type>', '- put <type> value']
    return help_commands


def create_cli():
    commands = construct_help_commands()

    # initialize the parser
    parser = argparse.ArgumentParser(
        description='app cli',
        formatter_class=argparse.RawTextHelpFormatter)

    parser.add_argument(
        '--cmd', '-c', nargs='+',
        help='Runs a CLI command where CMD is one of:\n%s' %
        "\n".join(commands))
    # '--cmd' is the keyword for the user to use; nargs='+' means at least 1 value,
    # so the parsed arg is a list

    # example
    # ./app.py --cmd get LB
    # ./app.py -c get LB
    # after parsing
    # args.cmd = ['get', 'LB']

    # true or false parameter
    parser.add_argument(
        '--json', action='store_true',
        help='Outputs in json format (default: %(default)s)')

    # example: True if provided, False if not
    # ./app.py --json
    # args.json = True

    # int parameter
    parser.add_argument(
        '--count', type=int, default=10,
        help='Outputs count (default: %(default)s)')

    # example: 20 if provided, 10 if not
    # ./app.py --count 20
    # args.count = 20

    parser.add_argument('host', nargs=1, help='host address')
    # positional, no need to provide a keyword, always the last one

    args = parser.parse_args()
    print(args)
    # ./app.py --cmd get LB --json www.test.com
    # Namespace(cmd=['get', 'LB'], json=True, count=10, host=['www.test.com'])

Some tips

zip() creates pairs from two sequences.

def zip_two_sequences_pairs():
    countries = ('Japan', 'Korea', 'China')
    cities = ('Tokyo', 'Seoul', 'Beijing')
    for country, city in zip(countries, cities):
        print(f'The capital of {country} is {city}')

zip_two_sequences_pairs()
The capital of Japan is Tokyo
The capital of Korea is Seoul
The capital of China is Beijing

better code snippets

# create a new list based on a condition from a list
import sys

arr = [1, 2, -1]
positives = [val+10 for val in arr if val >= 0]
print(positives)

# open a file always this way
# with open('filename.txt', 'w') as fd:
#     fd.write('Hello')

# compare with None using 'is'
value = None
if value is None:
    # some task
    pass

# Use join() to concatenate strings, which is faster
nst = 'a ' + 'b'
print(' '.join(['a', 'b']))

# Store unique values with sets
my_list = [1, 2, 3, 4, 5, 6, 7, 7, 7]
my_set = set(my_list)  # removes duplicates

# Save memory with generators when the list is huge

# list comprehension
my_list = [i for i in range(10000)]
print(sum(my_list))  # 49995000
print(sys.getsizeof(my_list), 'bytes')  # 87616 bytes

# generator comprehension
my_gen = (i for i in range(10000))
print(sum(my_gen))  # 49995000
print(sys.getsizeof(my_gen), 'bytes')  # 112 bytes

two prints in the same line

import time
print("hello",end='\r')
time.sleep(2)
print("boy")
boylo

Why "boylo": end='\r' moves the cursor back to the start of the line without advancing it, so the second print overwrites the beginning of the first; "boy" covers "hel" and the leftover "lo" remains.
