Log rotation for uwsgi and python related applications

Don’t send SIGHUP to your servers, or restart them, just to rotate logs.

SIGHUP is NOT meant for log rotation; it is used for configuration/module reload.
Take uWSGI as an example: as described in Managing the uWSGI server, SIGHUP gracefully reloads all the workers and the master process.

Thus, new incoming connections cannot be accepted during the reload.
If the uWSGI application needs 5 seconds to reload, a large number of connections cannot be served during that time. It may also cause false positives in health-checking mechanisms.

My suggestions

  • Use SIGUSR1 for web servers, such as Nginx.
  • Use logging.handlers.WatchedFileHandler in the application.
  • Use logrotate copytruncate if the upstream application does not provide a log rotation mechanism. (The log file should not be too big, and losing a few log lines must be acceptable.)
  • Use uwsgi touch-logreopen for uwsgi log rotation.

P.S.: You can also use RANDOM_DELAY to keep all your machines from rotating logs at the same time.
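
As a minimal sketch of the logging.handlers.WatchedFileHandler suggestion: the handler checks the log file before each write and reopens it if the file was moved or replaced, so no signal or restart is needed. The logger name, file names, and the rename that stands in for logrotate are assumptions for illustration, and the handler is intended for Unix-like systems.

```python
import logging
import logging.handlers
import os
import tempfile


def build_logger(log_path):
    """Return a logger whose file handler reopens log_path after rotation."""
    logger = logging.getLogger("rotation_demo")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.handlers.WatchedFileHandler(log_path)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def demo(directory):
    """Write a line, simulate a rotation, write again; return the log path."""
    log_path = os.path.join(directory, "app.log")  # hypothetical file name
    logger = build_logger(log_path)
    logger.info("before rotation")
    # Stand-in for logrotate (without copytruncate): move the file away.
    os.rename(log_path, log_path + ".1")
    # The handler notices the file is gone and creates a fresh app.log.
    logger.info("after rotation")
    for handler in logger.handlers[:]:
        handler.close()
        logger.removeHandler(handler)
    return log_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = demo(tmp)
        with open(path) as current:
            print(current.read(), end="")
```

After the rename, the rotated file holds the first message and the newly created file holds the second one.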

Use connection pool in Python Requests

Requests is a great library for making HTTP requests and handling responses.
According to its description, it supports connection pooling automatically.

Requests takes all of the work out of Python HTTP/1.1 — making your integration with web services seamless. There’s no need to manually add query strings to your URLs, or to form-encode your POST data. Keep-alive and HTTP connection pooling are 100% automatic, powered by urllib3, which is embedded within Requests.

In practice, a plain call to requests.request does not reuse the connection pool provided by urllib3.


Requests uses urllib3.PoolManager in requests.adapters.HTTPAdapter, which is mounted in requests.Session.__init__. Thus, each call to

def request(method, url, **kwargs):
    session = sessions.Session()
    return session.request(method=method, url=url, **kwargs)

creates a new session, so requests.request gains no benefit from the connection pool. Instead, you can use requests.Session so that Requests reuses the connection pool powered by urllib3.

session = requests.Session()
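
A minimal sketch of reusing one session for several requests, assuming the requests package is installed. The helper name fetch_status_codes, the timeout value, and the URL in the usage note are hypothetical.

```python
import requests


def fetch_status_codes(urls, timeout=5.0):
    """Fetch every URL through one shared Session and return the status codes.

    Requests to the same host can reuse a pooled connection because the
    session, and the HTTPAdapter mounted on it, live across the whole loop.
    """
    with requests.Session() as session:
        return [session.get(url, timeout=timeout).status_code for url in urls]
```

Calling fetch_status_codes(["https://example.com/"] * 10) would then issue ten requests through the same session rather than creating a new session per call.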

I wrote benchmark_request.py, adapted from urllib3's benchmark.py.

The result:

Completed requests_session_get in 7.067s
Completed requests_get in 9.162s

My C++11 async practice

I’d like to rewrite the example from "concurrent.futures.ProcessPoolExecutor.map may be slow in some cases" using C++11 async and future.

It takes some time to remember C++ stuff, but it’s still fun to write C++ code 🙂

Sample Code

#include <iostream>
#include <cmath>
#include <future>
#include <queue>
#include <vector>
#include <tuple>
#include <chrono>
#include <iterator>
#include <memory>
#include <functional>
#include <cstdint>

typedef uint32_t NUM_TYPE;

const size_t REPEAT_TIMES = 100;
const NUM_TYPE TEST_NUM = 200000;

const size_t MAX_WORKER = 4;

typedef std::tuple<NUM_TYPE, bool> RESULT_TYPE;
typedef std::priority_queue<NUM_TYPE, std::vector<NUM_TYPE>, std::greater<NUM_TYPE>> STORE_DATA_TYPE;

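// Returns (num, true) when num is prime, otherwise (num, false).
RESULT_TYPE is_prime(NUM_TYPE num) {
  if (num < 2) {
    return RESULT_TYPE(num, false);
  }
  if (num % 2 == 0) {
    // 2 is the only even prime.
    return RESULT_TYPE(num, num == 2);
  }

  NUM_TYPE sqrt_num = static_cast<NUM_TYPE>(std::floor(std::sqrt(num)));
  for (NUM_TYPE i = 3; i < sqrt_num + 1; i+=2) {
    if (num % i == 0) {
      return RESULT_TYPE(num, false);
    }
  }
  return RESULT_TYPE(num, true);
}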

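// Collects the primes found in nums, preserving input order.
std::queue<NUM_TYPE> is_prime_wrapper(std::vector<NUM_TYPE> nums) {
  std::queue<NUM_TYPE> result;
  for (auto & num : nums) {
    RESULT_TYPE prime = is_prime(num);
    if (std::get<1>(prime)) {
      result.push(std::get<0>(prime));
    }
  }
  return result;
}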

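// Splits [0, num) into MAX_WORKER slices, checks each slice in its own
// async task, and merges the primes into a min-heap.
STORE_DATA_TYPE async_worker(NUM_TYPE num) {
  std::vector<NUM_TYPE> nums;

  for (NUM_TYPE i = 0; i < num; ++i) {
    nums.push_back(i);
  }

  const size_t NUM_SIZE = num / MAX_WORKER;
  std::vector< std::future<std::queue<NUM_TYPE>> > futures;
  for (size_t i = 0 ; i < MAX_WORKER; ++i) {
    std::vector<NUM_TYPE> split_nums;
    if (i == MAX_WORKER - 1)  {
      // The last slice also takes the remainder.
      split_nums = std::vector<NUM_TYPE>(std::begin(nums) + NUM_SIZE * i, std::end(nums));
    } else {
      split_nums = std::vector<NUM_TYPE>(std::begin(nums) + NUM_SIZE * i,
                                         std::begin(nums) + NUM_SIZE * (i+1));
    }
    // std::launch::async runs each slice on its own thread.
    futures.push_back(std::async(std::launch::async, is_prime_wrapper, split_nums));
  }

  STORE_DATA_TYPE result;
  for (auto& worker : futures) {
    auto partial = worker.get();
    while (!partial.empty()) {
      result.push(partial.front());
      partial.pop();
    }
  }
  return result;
}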

int main() {
  using std::chrono::high_resolution_clock;
  using std::chrono::milliseconds;

  // Average the wall-clock time over REPEAT_TIMES runs.
  milliseconds total_ms(0);
  for (size_t i = 0; i < REPEAT_TIMES; ++i) {
    auto t0 = high_resolution_clock::now();
    STORE_DATA_TYPE result = async_worker(TEST_NUM);

    auto t1 = high_resolution_clock::now();
    total_ms += std::chrono::duration_cast<milliseconds>(t1 - t0);
  }
  std::cout << "takes " << total_ms.count() / REPEAT_TIMES << " ms" << std::endl;
  return 0;
}


takes 49 ms

concurrent.futures.ProcessPoolExecutor.map may be slow in some cases

A simple test using Python 3.3

Sample Code

from __future__ import print_function

from concurrent import futures
import math
import multiprocessing

def is_prime(num):
    # 0 and 1 are not prime; 2 is the only even prime.
    if num < 2:
        return False
    if num % 2 == 0:
        return num == 2

    sqrt_num = int(math.floor(math.sqrt(num)))
    for i in range(3, sqrt_num + 1, 2):
        if num % i == 0:
            return False

    return True

def prime_worker(count):
    return sorted(num for num in range(count) if is_prime(num))

def future_prime_worker(count):
    with futures.ProcessPoolExecutor(4) as executor:
        numbers = range(count)
        return sorted(num for num, prime in
                      zip(numbers, executor.map(is_prime, numbers)) if prime)

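def multiprocess_prime_worker(count):
    pool = multiprocessing.Pool(4)
    try:
        numbers = range(count)
        return sorted(num for num, prime in
                      zip(numbers, pool.map(is_prime, numbers)) if prime)
    finally:
        # Release the worker processes so repeated runs do not leak pools.
        pool.close()
        pool.join()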

if __name__ == '__main__':
    import timeit
    # timeit runs the statement 1,000,000 times by default, so pass an
    # explicit, small number of runs for these long-running workers.
    t = timeit.timeit("prime_worker(200000)",
                      setup="from __main__ import prime_worker",
                      number=1)
    print(t)

    t = timeit.timeit("multiprocess_prime_worker(200000)",
                      setup="from __main__ import multiprocess_prime_worker",
                      number=1)
    print(t)

    t = timeit.timeit("future_prime_worker(200000)",
                      setup="from __main__ import future_prime_worker",
                      number=1)
    print(t)



The cause: in Python 3.3, concurrent.futures.ProcessPoolExecutor.map() doesn’t batch function arguments into chunks, whereas multiprocessing.Pool.map does.
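
Since Python 3.5, Executor.map accepts a chunksize argument, which lets ProcessPoolExecutor send arguments to workers in batches. The sketch below is not part of the original test; the function name primes_below, the default of 4 workers, and the chunk size of 1000 are assumptions chosen for illustration.

```python
from concurrent import futures
import math


def is_prime(num):
    """Trial division; defined at module level so workers can import it."""
    if num < 2:
        return False
    if num % 2 == 0:
        return num == 2
    for i in range(3, int(math.sqrt(num)) + 1, 2):
        if num % i == 0:
            return False
    return True


def primes_below(count, executor=None, chunksize=1000):
    """Return the primes in range(count), batching work with chunksize.

    With executor=None a 4-process ProcessPoolExecutor is created, so the
    call then belongs under an `if __name__ == "__main__":` guard.
    ThreadPoolExecutor accepts chunksize too, but ignores it.
    """
    numbers = range(count)
    if executor is None:
        with futures.ProcessPoolExecutor(4) as pool:
            flags = list(pool.map(is_prime, numbers, chunksize=chunksize))
    else:
        flags = list(executor.map(is_prime, numbers, chunksize=chunksize))
    return [num for num, prime in zip(numbers, flags) if prime]
```

With chunksize=1000, the 200000 inputs are shipped to the workers as 200 batches instead of 200000 individual tasks, which is where most of the inter-process overhead goes.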

Using wheel for python deployment

I’ve been discussing with my colleagues a better deployment process to replace our existing RPM-based deployment.

Several projects provide binary formats, virtualenv manipulation, and caching for Python deployment.

For example:

They are both great projects, but our requirements are fairly simple:

  • Install Python packages system-wide (not necessarily in a virtualenv)
  • Avoid recompiling binaries; compilation should happen once, on the build machine
  • The deployed machines should not need gcc or development tools installed
  • The deployed machines may have limited internet access

After some investigation, wheel seems to be a perfect solution.


# wheel needs pip 1.4+
pip install --upgrade pip==dev
pip install wheel
pip wheel --wheel-dir=/tmp/wheelhouse flask
pip install --use-wheel --no-index --find-links=/tmp/wheelhouse flask

It also handles C extensions well (we’re using MySQL-Python).
We can also copy /tmp/wheelhouse to other deployment machines.

The result looks like this:

$ time pip install --pre --use-wheel --no-index --find-links=/tmp/wheelhouse flask
Ignoring indexes: https://pypi.python.org/simple/
Downloading/unpacking flask
Downloading/unpacking itsdangerous>=0.21 (from flask)
Downloading/unpacking Werkzeug>=0.7 (from flask)
Downloading/unpacking Jinja2>=2.4 (from flask)
Downloading/unpacking markupsafe (from Jinja2>=2.4->flask)
Installing collected packages: flask, itsdangerous, Werkzeug, Jinja2, markupsafe
Successfully installed flask itsdangerous Werkzeug Jinja2 markupsafe
Cleaning up...
pip install --pre --use-wheel --no-index flask 0.31s user 0.07s system 90% cpu 0.423 total

PEP 427 has been accepted, and a future pip release will include wheel support.
I’m looking forward to seeing wheel used in our environment.
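
The build-once, install-offline flow above could be scripted roughly as follows. This is a sketch under assumptions: the function names are hypothetical, pip is invoked as a module of the current interpreter, and --use-wheel is left out because recent pip releases use wheels by default.

```python
import subprocess
import sys


def wheel_commands(package, wheelhouse="/tmp/wheelhouse"):
    """Return the (build, install) pip command lines for one package."""
    pip = [sys.executable, "-m", "pip"]
    # Run on the build machine: compile once and store wheels in wheelhouse.
    build = pip + ["wheel", "--wheel-dir=" + wheelhouse, package]
    # Run on the deployed machine: install from wheelhouse only, no index.
    install = pip + ["install", "--no-index",
                     "--find-links=" + wheelhouse, package]
    return build, install


def build_and_install(package, wheelhouse="/tmp/wheelhouse"):
    """Run both steps on one machine, stopping at the first failure."""
    for command in wheel_commands(package, wheelhouse):
        subprocess.check_call(command)
```

In a real deployment the two commands would run on different machines, with the wheelhouse directory copied between them.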

Use textwrap.dedent

If you write a long multi-line string in Python, it breaks the indentation and looks like the following:

import string
import random

def generate_long_string(n):
    for _ in xrange(n):
        name = "".join([random.choice(string.letters) for i in xrange(15)])
        msg = """\
Hi %s,

""" % name
        yield msg

for msg in generate_long_string(3):
    print msg

You can use textwrap.dedent to solve this:

import string
import random

from textwrap import dedent

def generate_long_string(n):
    for _ in xrange(n):
        name = "".join([random.choice(string.letters) for i in xrange(15)])
        msg = """\
            Hi %s,

            """ % name
        yield dedent(msg)

for msg in generate_long_string(3):
    print msg
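
For Python 3, the same idea can be sketched as below; the function name greeting and the sample name are assumptions. dedent strips the common leading whitespace and empties the whitespace-only last line, so the message comes out flush left.

```python
from textwrap import dedent


def greeting(name):
    """Build the message with the literal indented to match the code."""
    msg = """\
        Hi %s,

        """ % name
    return dedent(msg)


if __name__ == "__main__":
    print(repr(greeting("Bob")))
```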

Trend Writing Secure Code Materials

This is probably the last time I teach Trend Micro's Engineer Training Program: Writing Secure Code.

The course originally drew its teaching material from Microsoft's Writing Secure Code, Second Edition.



  1. Buffer Overflow
  2. SQL Injection
  3. CSRF
  4. XSS
  5. Password Storage
  6. Threat Modeling



Git Repo