3-4 Hours
Then, We Can Start Python Programming
Code Style
Naming Conventions
Variables, functions, methods, packages, modules :
lower_case_with_underscores
Classes and Exceptions :
CapWords
Private methods :
_single_leading_underscore(self, ...)
Constants :
ALL_CAPS_WITH_UNDERSCORES
Avoid one-letter variables, especially the ones easily confused with 1 and 0 :
l, O, I
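A small sketch that puts the conventions above side by side. Every name in it (MAX_CONNECTIONS, ConnectionPool, PoolExhaustedError and so on) is made up for illustration only.

```python
# Constant: ALL_CAPS_WITH_UNDERSCORES
MAX_CONNECTIONS = 2


class PoolExhaustedError(Exception):
    """Exception names use CapWords, like classes."""


class ConnectionPool:
    """Class names use CapWords."""

    def __init__(self):
        # Variable / attribute: lower_case_with_underscores
        self.active_count = 0

    def acquire(self):
        """Public method: lower_case_with_underscores."""
        if not self._has_capacity():
            raise PoolExhaustedError("no free connections")
        self.active_count += 1
        return self.active_count

    def _has_capacity(self):
        """Private method: _single_leading_underscore."""
        return self.active_count < MAX_CONNECTIONS
```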
Avoid Redundant Labeling
Yes
import audio
core = audio.Core()
controller = audio.Controller()
No
import audio
core = audio.AudioCore()
controller = audio.AudioController()
Prefer "Reverse Notation"
Yes
elements = ...
elements_active = ...
elements_defunct = ...
No
elements = ...
active_elements = ...
defunct_elements = ...
Line Length : Don't Stress Over 80-100 Characters
# Use parentheses for line continuations.
wiki = (
    "The Colt Python is a .357 Magnum caliber revolver formerly "
    "manufactured by Colt's Manufacturing Company of Hartford, "
    "Connecticut. It is sometimes referred to as a Combat Magnum. "
    "It was first introduced in 1955, the "
    "same year as Smith & Wesson's M29 .44 Magnum."
)
30 - 30 Principle
A function should never exceed 30 lines.
A class can contain at most 30 methods.
Google Comment Style - Function
def func(arg1, arg2, arg3=None):
    """ Intro line

    Rant on here. Make sure it's at least 3-line long (j/k)

    Args:
        arg1: An argument
        arg2: Another argument
        arg3: optional argument

    Returns:
        Describe returned value(s) here

    Raises:
        Error
    """
    pass
Google Comment Style - Class
class SampleClass(object):
    """ Summary of class here.

    Longer class information....
    Longer class information....

    Attributes:
        likes_spam: A boolean indicating if we like SPAM or not.
        eggs: An integer count of the eggs we have laid.
    """

    def __init__(self, likes_spam=False):
        """Inits SampleClass with blah."""
        self.likes_spam = likes_spam
        self.eggs = 0

    def public_method(self):
        """Performs operation blah."""
PEP 8 Python Style Guide
A style guide improves the readability of code and
makes it consistent across the wide spectrum of Python code.
- Code Layout
  - Indentation
  - Maximum Line Length
  - Should a Line Break Before or After a Binary Operator?
  - Blank Lines
  - Imports
  - ...
- Naming Conventions
  - Names to Avoid
  - Class Names
  - Global Variable Names
  - Function and Parameter Names
  - Constants
  - ...
- Comments
  - Block Comments
  - Inline Comments
  - Documentation Strings
  - ...
- Others
  - When to Use Trailing Commas
  - Whitespace in Expressions and Statements
  - String Quotes
  - ...
What is Pythonic Code?
The Way We Use the Features of the Python Language
to Produce Clear, Concise and Maintainable Code
Pythonic Code
Making The Program More Efficient
Dictionary For Performance
What is the difference between these two algorithms?
# Version 1: find each id by scanning a list
data_list = [...]  # len = 500,000
interesting_ids = [...]  # len = 100
interesting_points = []
for i in interesting_ids:
    point = find_point_by_id_in_list(data_list, i)
    interesting_points.append(point)
# Version 2: find each id in a dict keyed by id
data_lookup = {...}  # len = 500,000
interesting_ids = [...]  # len = 100
interesting_points = []
for i in interesting_ids:
    point = data_lookup[i]
    interesting_points.append(point)
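The difference is that the list version scans up to every element for each id, while the dict version does one hash lookup per id. The following is a rough, self-contained sketch for measuring it. The data layout (dicts with an "id" field), the smaller size of 100,000 and the body of find_point_by_id_in_list are assumptions made for the demonstration, not the original implementation.

```python
import time

# Hypothetical data: 100,000 points, each a dict with an "id" field.
data_list = [{"id": i, "value": i * 2} for i in range(100_000)]
interesting_ids = list(range(0, 100_000, 1_000))  # 100 ids


def find_point_by_id_in_list(points, point_id):
    """Linear scan: O(n) work per lookup."""
    for point in points:
        if point["id"] == point_id:
            return point
    return None


t0 = time.perf_counter()
points_from_list = [find_point_by_id_in_list(data_list, i) for i in interesting_ids]
list_seconds = time.perf_counter() - t0

# Build the index once; afterwards each lookup is O(1) on average.
data_lookup = {point["id"]: point for point in data_list}

t0 = time.perf_counter()
points_from_dict = [data_lookup[i] for i in interesting_ids]
dict_seconds = time.perf_counter() - t0

print("list scan  : %.6fs" % list_seconds)
print("dict lookup: %.6fs" % dict_seconds)
```

Both versions return the same points; only the lookup cost differs. Building the dict has a one-time cost, so the gain grows with the number of lookups.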
High-Performance Container Datatypes
Name | Description | Availability |
---|---|---|
namedtuple() | factory function for creating tuple subclasses with named fields | New in version 2.6. |
deque | list-like container with fast appends and pops on either end | New in version 2.4. |
Counter | dict subclass for counting hashable objects | New in version 2.7. |
OrderedDict | dict subclass that remembers the order entries were added | New in version 2.7. |
defaultdict | dict subclass that calls a factory function to supply missing values | New in version 2.5. |
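A brief tour of the five containers listed above, as a sketch. The variable names and sample data are invented for the example.

```python
from collections import Counter, OrderedDict, defaultdict, deque, namedtuple

# namedtuple: a tuple subclass with named fields.
Point = namedtuple("Point", ["x", "y"])
p = Point(x=1, y=2)

# deque: fast appends and pops on both ends.
queue = deque([1, 2, 3])
queue.appendleft(0)
queue.append(4)
first = queue.popleft()

# Counter: count hashable objects.
word_counts = Counter("the quick brown fox jumps over the lazy dog the end".split())

# OrderedDict: remembers insertion order and can reorder entries.
recent = OrderedDict.fromkeys(["a", "b", "c"])
recent.move_to_end("a")

# defaultdict: the factory (here, list) supplies missing values.
groups = defaultdict(list)
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)
```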
Memory Efficiency with Slots
Custom types store their data in individualized, dynamic dictionaries via self.__dict__. Using __slots__ to limit available attribute names and move the name/key storage outside the instance to a type level can significantly improve memory usage.
class ImmutableThing:
    __slots__ = ['a', 'b', 'c']

    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c
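To make the effect visible, the sketch below compares a plain class with a slotted one. PlainThing and SlottedThing are hypothetical names used only here.

```python
class PlainThing:
    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c


class SlottedThing:
    __slots__ = ("a", "b", "c")

    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c


plain = PlainThing(1, 2, 3)
slotted = SlottedThing(1, 2, 3)

# A plain instance carries its own __dict__; a slotted one does not.
has_dict_plain = hasattr(plain, "__dict__")
has_dict_slotted = hasattr(slotted, "__dict__")

# Only the names listed in __slots__ may be assigned.
try:
    slotted.d = 4
    rejected = False
except AttributeError:
    rejected = True

# Existing slots can still be reassigned.
slotted.a = 10
```

Note that __slots__ only restricts which attributes exist. It does not by itself make instances immutable.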
Pythonic Code
Efficient Built-In Tools
Bisect for Quick Search
Bisect provides support for maintaining a list in sorted order without having to sort the list after each insertion.
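import bisect
import random

# Reset the seed so the run is repeatable.
random.seed(1)

# Use bisect_left and insort_left.
# The numbers drawn depend on the Python version's random generator,
# so a different version may not reproduce the sample values exactly.
values = []
for i in range(5):
    r = random.randint(1, 100)
    position = bisect.bisect_left(values, r)
    bisect.insort_left(values, r)
    print('%2d %2d' % (r, position), values)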
$ python bisect_example.py
14 0 [14]
85 1 [14, 85]
77 1 [14, 77, 85]
26 1 [14, 26, 77, 85]
50 2 [14, 26, 50, 77, 85]
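Beyond keeping a list sorted, bisect is handy for table lookups on sorted breakpoints. The sketch below follows the numeric-grade example from the standard library documentation. The breakpoints and grade letters are sample values.

```python
import bisect


def grade(score, breakpoints=(60, 70, 80, 90), grades="FDCBA"):
    """Map a score to a letter by binary search over sorted breakpoints."""
    index = bisect.bisect(breakpoints, score)
    return grades[index]


letters = [grade(score) for score in [33, 99, 77, 70, 89, 90, 100]]
print(letters)
```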
Pickle For Fast Object Serialization
import pickle
a = A_VERY_LARGE_OBJECT
with open('filename.pickle', 'wb') as handle:
pickle.dump(a, handle, protocol=pickle.HIGHEST_PROTOCOL)
with open('filename.pickle', 'rb') as handle:
b = pickle.load(handle)
print(a == b)
Array Serialization
from array import array
from random import random
floats = array('d', (random() for i in range(10**7)))
fp = open('floats.bin', 'wb')
floats.tofile(fp)
fp.close()
floats2 = array('d')
fp = open('floats.bin', 'rb')
floats2.fromfile(fp, 10**7)
fp.close()
Pythonic Code
Taking Advantage of Decorators
What is a Decorator?
You Might Have Used Decorators
Even Without Knowing It
from flask import Flask
app = Flask(__name__)
@app.route("/")
def hello():
return "Hello World!"
from functools import lru_cache
@lru_cache(maxsize=None)
def fib(n):
if n < 2:
return n
return fib(n-1) + fib(n-2)
A Decorator Is Used to Modify the Function It Wraps
The Following Three Snippets Have the Same Effect.
# State, before defining f, that a_decorator will be applied to it.
@a_decorator
def f(...):
...
def f(...):
...
# After defining f, apply a_decorator to it.
f = a_decorator(f)
def a_decorator():
...
def f(...):
...
...
return f
Print Decorated Function's Input Output
def deco(func):
    def inner(*args):
        print('Input :', args)
        output = func(*args)
        print('Output:', output)
        return output
    return inner

@deco
def add(a, b):
    return a + b
>>> add(1, 2)
Input : (1, 2)
Output: 3
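One caveat with the decorator above: the wrapper replaces the original function, so add.__name__ becomes 'inner', and keyword arguments are not accepted. A common refinement, sketched here, uses functools.wraps and forwards **kwargs. The default value b=0 and the docstring are additions for the demonstration.

```python
import functools


def deco(func):
    @functools.wraps(func)  # copy __name__, __doc__, etc. onto inner
    def inner(*args, **kwargs):
        print("Input :", args, kwargs)
        output = func(*args, **kwargs)
        print("Output:", output)
        return output
    return inner


@deco
def add(a, b=0):
    """Return the sum of a and b."""
    return a + b


total = add(1, b=2)
```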
Example 1 - Use Decorator to Automatically Log
Input, Output and Execution Time
import time

def timer(func):
    def clock(*args):
        t0 = time.perf_counter()
        result = func(*args)
        elapsed = time.perf_counter() - t0
        name = func.__name__
        arg_str = ', '.join(repr(arg) for arg in args)
        print('[%0.8fs] %s(%s) -> %r' % (elapsed, name, arg_str, result))
        return result
    return clock
Example 1 - Use Decorator to Automatically Log
Input, Output and Execution Time
import time
from timer import timer
@timer
def factorial(n):
return 1 if n < 2 else n*factorial(n-1)
if __name__=='__main__':
factorial(4)
$ python3 clockdeco_demo.py
[0.00000191s] factorial(1) -> 1
[0.00004911s] factorial(2) -> 2
[0.00008488s] factorial(3) -> 6
[0.00013208s] factorial(4) -> 24
Example 2 - Use Decorator to Make Class Singleton
class Singleton:
    _singletons = dict()

    def __init__(self, decorated):
        self._decorated = decorated

    def getInstance(self):
        key = self._decorated.__name__
        try:
            return Singleton._singletons[key]
        except KeyError:
            Singleton._singletons[key] = self._decorated()
            return Singleton._singletons[key]

    def __call__(self):
        raise Exception(
            'Singletons must be accessed through the `getInstance` method.')
Example 2 - Use Decorator to Make Class Singleton
@Singleton
class Foo:
def __init__(self):
print('Foo created')
def bar(self, obj):
print(obj)
foo = Foo() # Wrong, raises Exception
foo = Foo.getInstance()
goo = Foo.getInstance()
print(goo is foo) # True
foo.bar("Hello, world! I'm a singleton.")
Python Special Methods for Creating and Calling Objects
Name | Description |
---|---|
__new__ | Creates and returns the new instance; runs before __init__. |
__init__ | Initializes the newly created instance. |
__call__ | Makes instances of the class callable like functions. |
Python Special Methods for Creating and Calling Objects
class Foo:
def __new__(cls, *args, **kwargs):
print('Calling __new__')
return super(Foo, cls).__new__(cls)
def __init__(self, a):
print('Calling __init__')
self.a = a
def __call__(self, *args, **kwargs):
print('Calling __call__')
print(self.a)
>>> f = Foo(1)
Calling __new__
Calling __init__
>>> f()
Calling __call__
1
Pythonic Code
Meta Programming
Example I : __len__ & __getitem__ (1)
import collections
Card = collections.namedtuple('Card', ['rank', 'suit'])
class FrenchDeck:
ranks = [str(n) for n in range(2, 11)] + list('JQKA')
suits = 'spades diamonds clubs hearts'.split()
def __init__(self):
self._cards = [Card(rank, suit) for suit in self.suits
for rank in self.ranks]
def __len__(self):
return len(self._cards)
def __getitem__(self, position):
return self._cards[position]
Example I : __len__ & __getitem__ (2)
>>> deck = FrenchDeck()
>>> len(deck)
52
>>> deck[0]
Card(rank='2', suit='spades')
>>> deck[-1]
Card(rank='A', suit='hearts')
>>> from random import choice
>>> choice(deck)
Card(rank='3', suit='hearts')
>>> choice(deck)
Card(rank='K', suit='spades')
>>> choice(deck)
Card(rank='2', suit='clubs')
Example I : __len__ & __getitem__ (3)
>>> deck[:3]
[Card(rank='2', suit='spades'), Card(rank='3', suit='spades'),
Card(rank='4', suit='spades')]
>>> deck[12::13]
[Card(rank='A', suit='spades'), Card(rank='A', suit='diamonds'),
Card(rank='A', suit='clubs'), Card(rank='A', suit='hearts')]
>>> for card in deck:
... print(card)
Card(rank='2', suit='spades')
Card(rank='3', suit='spades')
Card(rank='4', suit='spades')
...
>>> for card in reversed(deck):
... print(card)
Card(rank='A', suit='hearts')
Card(rank='K', suit='hearts')
Card(rank='Q', suit='hearts')
...
Example II : __iter__
class Article:
def __init__(self, sentences):
self.sentences = sentences
def __iter__(self):
return (sentence for sentence in self.sentences)
class Sentence:
def __init__(self, words):
self.words = words
def __iter__(self):
return (word for word in self.words)
...
>>> for sentence in article:
...     for word in sentence:
...         print(word)
Example III : __repr__
from array import array
import math
class Vector2d:
def __init__(self, x, y):
self.x = float(x)
self.y = float(y)
def __iter__(self):
return (i for i in (self.x, self.y))
def __repr__(self):
class_name = type(self).__name__
return '{}({!r}, {!r})'.format(class_name, *self)
...
>>> v = Vector2d(1, 2)
>>> print(v)
Vector2d(1.0, 2.0)
Object Oriented Design
Why Is This Code Badly Designed?
class UserSettings:
def __init__(self, user):
self.user = user
def change_setting(self, setting):
if self.verify_credential():
# do change setting
pass
def verify_credential(self):
# do verify credential
pass
Single Responsibility Principle
There should never be more than one reason for a class to change
Single Responsibility Principle - Good Code
class UserSettings:
def __init__(self, user):
self.user = user
self.auth = UserAuth(user)
def change_setting(self, setting):
if self.auth.verify_credential():
# do change setting
pass
class UserAuth:
def __init__(self, user):
self.user = user
def verify_credential(self):
# do verify credential
pass
Is This Good Code?
class Rectangle:
def __init__(self, width, height):
self.width = width
self.height = height
class AreaCalculator:
def compute_area(self, shapes):
""" Compute the sum area of a shape collection """
area = 0.0
for shape in shapes:
area += shape.width * shape.height
return area
Possible Extension 1
The collection needs to contain circles ...
A Quick Solution Based On Previous Design
class Rectangle:
...
class Circle:
def __init__(self, radius):
self.radius = radius
class AreaCalculator:
def compute_area(self, shapes):
""" Compute the sum area of a shape collection """
area = 0.0
for shape in shapes:
# Check type here
if isinstance(shape, Rectangle):
area += shape.width * shape.height
else:
area += 3.14 * shape.radius * shape.radius
return area
Possible Extension 2
What if the collection needs to contain triangles, diamonds, octagons, etc.?
Open / Closed Principle
Software entities should be open for extension, but closed for modification
Open / Closed Principle - Good Code
from abc import ABC, abstractmethod
class Shape(ABC):
@abstractmethod
def area(self):
pass
class Rectangle(Shape):
def __init__(self, width, height):
self.width = width
self.height = height
def area(self):
return self.width * self.height
class Circle(Shape):
def __init__(self, radius):
self.radius = radius
def area(self):
return self.radius * self.radius * 3.14
class AreaCalculator:
def compute_area(self, shapes):
""" Compute the sum area of a shape collection """
area = 0.0
for shape in shapes:
area += shape.area()
return area
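To see what "open for extension" buys, the sketch below adds a new shape without touching the area computation. Triangle is a hypothetical addition, and compute_area is written as a plain function to keep the example short.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    @abstractmethod
    def area(self):
        """Return the area of the shape."""


class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


class Triangle(Shape):
    """New shape: added later, with no change to compute_area."""

    def __init__(self, base, height):
        self.base = base
        self.height = height

    def area(self):
        return 0.5 * self.base * self.height


def compute_area(shapes):
    """Sum the areas of a collection of shapes."""
    return sum(shape.area() for shape in shapes)


total = compute_area([Rectangle(2, 3), Triangle(4, 5)])
```

Because Shape.area is abstract, a subclass that forgets to implement it cannot be instantiated, which catches the mistake early.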
Is This A Good Design ?
from abc import ABC, abstractmethod
class IEmployee(ABC):
@abstractmethod
def work(self):
pass
@abstractmethod
def eat(self):
pass
class Researcher(IEmployee):
def work(self):
pass
def eat(self):
pass
class Programmer(IEmployee):
def work(self):
pass
def eat(self):
pass
Interface Segregation Principle
Clients should not be forced to depend upon interfaces that they do not use.
Interface Segregation Principle - Good Code
from abc import ABC, abstractmethod
class Workable(ABC):
@abstractmethod
def work(self):
pass
class Feedable(ABC):
@abstractmethod
def eat(self):
pass
class Researcher(Workable, Feedable):
def work(self):
pass
def eat(self):
pass
class Programmer(Workable, Feedable):
def work(self):
pass
def eat(self):
pass
Interface Segregation Principle - A Real Example
Composition Over Inheritance - Bad Implementation
OO Design Principles
- SRP: Single Responsibility Principle. There should never be more than one reason for a class to change.
- OCP: Open / Closed Principle. Software entities should be open for extension, but closed for modification.
- ISP: Interface Segregation Principle. Clients should not be forced to depend upon interfaces that they do not use.
Python Inheritance
1. Do Not Inherit From Built-In Data Types.
# Bad Implementation
class DoppelDict(dict):
def __setitem__(self, key, value):
super().__setitem__(key, [value] * 2)
>>> dd = DoppelDict(one=1)
>>> dd
{'one': 1}
>>> dd['two'] = 2
>>> dd
{'one': 1, 'two': [2, 2]}
>>> dd.update(three=3)
>>> dd
{'three': 3, 'one': 1, 'two': [2, 2]}
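The session above shows the problem: the built-in dict's __init__ and update() bypass the overridden __setitem__. One commonly suggested alternative is to subclass collections.UserDict, which routes those operations through __setitem__. The class name DoppelDict2 is chosen here for illustration.

```python
import collections


class DoppelDict2(collections.UserDict):
    def __setitem__(self, key, value):
        # Every write path (constructor, [] assignment, update) ends up here.
        super().__setitem__(key, [value] * 2)


dd = DoppelDict2(one=1)
dd["two"] = 2
dd.update(three=3)
print(dd)
```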
2. Use Mixins for Multiple Inheritance
Example I - A General Repr Mixin
import reprlib
class ReprLibMixin(object):
def __repr__(self):
return "<{} {attr}>".format(
self.__class__.__name__,
attr=" ".join("{}={}".format(k, reprlib.repr(v)) for k, v in sorted(self.__dict__.items())),
)
class Word(ReprLibMixin):
def __init__(self, text, sentence_index, article_index, pos_tag):
self.text = text
self.sentence_index = sentence_index
self.article_index = article_index
self.pos_tag = pos_tag
Example II - Comparable Mixin
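class Comparable(object):
    """Mixin: derives !=, <, > and >= from the subclass's __eq__ and __le__."""
    def __ne__(self, other):
        return not (self == other)
    def __lt__(self, other):
        return self <= other and (self != other)
    def __gt__(self, other):
        return not self <= other
    def __ge__(self, other):
        return self == other or self > other

class Integer(Comparable):
    def __init__(self, i):
        self.i = i
    # The mixin relies on these two methods being defined by the subclass.
    def __eq__(self, other):
        return self.i == other.i
    def __le__(self, other):
        return self.i <= other.i

class Char(Comparable):
    def __init__(self, c):
        self.c = c
    def __eq__(self, other):
        return self.c == other.c
    def __le__(self, other):
        return self.c <= other.c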
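For comparison, the standard library offers functools.total_ordering, which fills in the remaining ordering methods from __eq__ plus one of __lt__, __le__, __gt__ or __ge__. The sketch below is an alternative to the hand-written mixin, not a description of how the mixin itself works.

```python
import functools


@functools.total_ordering
class Integer:
    def __init__(self, i):
        self.i = i

    def __eq__(self, other):
        if not isinstance(other, Integer):
            return NotImplemented
        return self.i == other.i

    def __lt__(self, other):
        if not isinstance(other, Integer):
            return NotImplemented
        return self.i < other.i


one, two = Integer(1), Integer(2)
```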
3. Enrich Constructor With
Static Factory Method
class Sentence:
    def __init__(self, words, article_indices, sentence_indices, ...):
        pass

    @staticmethod
    def fromText(sentence_str):
        pass

    @staticmethod
    def fromLtpJson(ltp_json):
        pass
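A runnable sketch of the same idea, with the factory bodies filled in. It uses @classmethod rather than @staticmethod so that subclasses get instances of their own type. The whitespace tokenizer and the JSON layout (a "words" list) are assumptions for the demonstration, unrelated to any real LTP format.

```python
import json


class Sentence:
    def __init__(self, words):
        self.words = list(words)

    @classmethod
    def from_text(cls, sentence_str):
        """Build a Sentence by splitting on whitespace (naive tokenizer)."""
        return cls(sentence_str.split())

    @classmethod
    def from_json(cls, payload):
        """Build a Sentence from a JSON string holding a "words" list."""
        return cls(json.loads(payload)["words"])


s1 = Sentence.from_text("decorators are fun")
s2 = Sentence.from_json('{"words": ["decorators", "are", "fun"]}')
```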
Performance Tuning
Step 1: Make A Better Algorithm
- 1. Binary Search
- 2. Quick Sort
- 3. Data Structure: BST, Trie, Interval Tree, Stack etc.
- 4. Dynamic Programming (Tabulation, Memoization)
- 5. Save Previous Results in Dict
- ...
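As an illustration of items 4 and 5 in the list above, the sketch below memoizes a recursive computation in a plain dict. The grid-path problem and the name count_paths are chosen only for the example.

```python
def count_paths(rows, cols, cache=None):
    """Count right/down paths from the top-left to the bottom-right cell."""
    if cache is None:
        cache = {}
    if rows == 1 or cols == 1:
        return 1  # a single row or column allows exactly one path
    key = (rows, cols)
    if key not in cache:
        # Save the result so each (rows, cols) pair is computed only once.
        cache[key] = (count_paths(rows - 1, cols, cache)
                      + count_paths(rows, cols - 1, cache))
    return cache[key]


small = count_paths(3, 3)
wide = count_paths(3, 7)
```

Without the cache the recursion recomputes the same sub-grids many times and grows exponentially; with it, the work is proportional to rows * cols.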
Step 2: Cache
from functools import lru_cache
@lru_cache(maxsize=None)
def fib(n):
if n <= 0:
print("Incorrect input")
elif n == 1:
return 0
elif n == 2:
return 1
else:
return fib(n-1) + fib(n-2)
Step 3: Python Coroutine
import aiohttp
import asyncio

URL = 'https://www.youtube.com/'

async def job(sess):
    # Release the connection as soon as the URL has been read.
    async with sess.get(URL) as resp:
        return str(resp.url)

async def main():
    async with aiohttp.ClientSession() as sess:
        # Schedule both requests so they run concurrently.
        tasks = [asyncio.create_task(job(sess)) for _ in range(2)]
        finished, unfinished = await asyncio.wait(tasks)
        for r in finished:
            print(r.result())

# asyncio.run() creates the event loop, runs main() and closes the loop.
asyncio.run(main())
Step 4: Python Multiprocessing
from multiprocessing import Pool
def f(x):
return x*x
if __name__ == '__main__':
p = Pool(5)
print(p.map(f, [1, 2, 3]))
Step 5: Others...
- 1. Cython
- 2. numba
- 3. PyPy
- 4. f2py
- ...
Profiling
import cProfile
import re
cProfile.run('re.compile("foo|bar")')
         197 function calls (192 primitive calls) in 0.002 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.001    0.001 <string>:1(<module>)
        1    0.000    0.000    0.001    0.001 re.py:212(compile)
        1    0.000    0.000    0.001    0.001 re.py:268(_compile)
        1    0.000    0.000    0.000    0.000 sre_compile.py:172(_compile_charset)
        1    0.000    0.000    0.000    0.000 sre_compile.py:201(_optimize_charset)
        4    0.000    0.000    0.000    0.000 sre_compile.py:25(_identityfunction)
      3/1    0.000    0.000    0.000    0.000 sre_compile.py:33(_compile)
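When the report gets long, it usually helps to sort it and show only the top entries. The sketch below profiles a block of code with cProfile.Profile and formats the result with pstats. slow_sum is a made-up workload for the demonstration.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive loop to give the profiler something to measure."""
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Sort by cumulative time and print only the top 5 entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

Sorting by "cumulative" highlights the functions where most time is spent including their callees, which is usually the right place to start optimizing.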