Setting up a Python development environment with Buildout

Note: Check out Conda before trying this.

Buildout is a Python-based build system for creating, assembling and deploying applications from multiple parts, some of which may be non-Python-based. It lets you create a buildout configuration and reproduce the same software later.
(Source: buildout.org)

I’ve documented the steps required to create a simple buildout-based project.

  1. Start by creating a project directory and initialising a virtual environment inside it.

    $ mkdir word_count_buildout && cd word_count_buildout
    $ virtualenv .env
    $ source .env/bin/activate
    
  2. Fetch bootstrap-buildout.py, a common script that creates the necessary directories and eggs (setuptools, etc.).

     
    $ wget https://bootstrap.pypa.io/bootstrap-buildout.py
    
  3. Create a buildout configuration file

    $ vi buildout.cfg
    

    Copy and paste the following snippet:

    [buildout]
    develop = .
    parts = job job-test scripts zipeggs
    # index = http://mypypicloud.example.com:6543/pypi/
    
    [job]
    recipe = zc.recipe.egg
    interpreter = python
    eggs = wordcount-job
    
    [job-test]
    recipe = pbp.recipe.noserunner
    eggs = pbp.recipe.noserunner
    working-directory = ${buildout:directory}
    
    [scripts]
    recipe = zc.recipe.egg:scripts
    eggs = dumbo
    
    [zipeggs]
    recipe = zipeggs:zipeggs
    target = dist 
    source = eggs
    

    The basic config file structure is explained here.

    • ln:4 index: if you are using a private PyPI, uncomment this line and replace the URL.
    • ln:21 recipe = zipeggs:zipeggs: a buildout recipe that zips all flattened (unzipped) eggs. Unzipped eggs are convenient while developing (for debugging) and they load faster, but Dumbo requires zipped eggs to be passed via the -libegg parameter. The zipeggs recipe generates zipped eggs under the target directory (dist). More details in the @tamizhgeek repo.
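To make the zipping step less of a black box, here is a minimal sketch of the idea in plain Python 3. It is not the zipeggs recipe's actual code; `zip_egg` and `zip_all_eggs` are hypothetical helpers, and I am assuming that unzipped eggs are directories whose names end in `.egg`.

```python
import os
import zipfile


def zip_egg(egg_dir, target_dir):
    """Zip one unzipped egg directory into target_dir, keeping the egg's name."""
    os.makedirs(target_dir, exist_ok=True)
    egg_name = os.path.basename(os.path.normpath(egg_dir))
    zip_path = os.path.join(target_dir, egg_name)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(egg_dir):
            for filename in files:
                full_path = os.path.join(root, filename)
                # Store paths relative to the egg root so packages sit at the
                # top level of the archive.
                archive.write(full_path, os.path.relpath(full_path, egg_dir))
    return zip_path


def zip_all_eggs(source_dir, target_dir):
    """Zip every egg directory under source_dir; plain files are skipped."""
    zipped = []
    for entry in sorted(os.listdir(source_dir)):
        path = os.path.join(source_dir, entry)
        if entry.endswith(".egg") and os.path.isdir(path):
            zipped.append(zip_egg(path, target_dir))
    return zipped
```

Calling `zip_all_eggs("eggs", "dist")` mirrors the `source` and `target` options of the [zipeggs] part above.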
  4. Execute the bootstrap file

    $ python bootstrap-buildout.py
    
    Downloading https://pypi.python.org/packages/source/s/setuptools/setuptools-18.1.zip
    Extracting in /tmp/tmpZ7ki33
    Now working in /tmp/tmpZ7ki33/setuptools-18.1
    Building a Setuptools egg in /tmp/bootstrap-q4Xfc5
    /tmp/bootstrap-q4Xfc5/setuptools-18.1-py2.7.egg
    Creating directory '/home/raj/workspace/word_count/eggs'.
    Creating directory '/home/raj/workspace/word_count/bin'.
    Creating directory '/home/raj/workspace/word_count/parts'.
    Creating directory '/home/raj/workspace/word_count/develop-eggs'.
    Generated script '/home/raj/workspace/word_count/bin/buildout'.
    

    The script has created a few directories and a buildout script inside the bin directory. Read more about the directory structure here.

  5. Create setup.py.

    $ vi setup.py
    
    from setuptools import setup, find_packages
    import os
    version = os.environ.get("PIPELINE_LABEL", "1.0")
    setup(
        name="wordcount-job",
        version=version,
        packages=find_packages(),
        zip_safe=True,
        install_requires=[
            'dumbo'
        ]
    )
    

    Read more about setuptools here.
    Any change to setup.py or buildout.cfg should be followed by running ./bin/buildout again.
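The version lookup in setup.py falls back to 1.0 whenever PIPELINE_LABEL is not set in the environment. A minimal sketch of that behaviour, where `resolve_version` is a hypothetical helper of mine and not part of setuptools:

```python
import os


def resolve_version(environ=os.environ, default="1.0"):
    """Return the package version the same way the setup.py above does."""
    return environ.get("PIPELINE_LABEL", default)


# Plain dicts stand in for the real environment here.
print(resolve_version({}))                        # prints 1.0
print(resolve_version({"PIPELINE_LABEL": "42"}))  # prints 42
```

Running the build with the variable exported, for example PIPELINE_LABEL=42 ./bin/buildout setup . bdist_egg, should therefore stamp the egg with that label instead of the default.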

  6. Create a Python module for a simple word-count Dumbo job.

    $ mkdir wordcount-job && touch wordcount-job/__init__.py
    $ vi wordcount-job/wordcount.py
    

    Copy and paste the following code:

    def mapper(key, value):
        for word in value.split():
            yield word, 1
    
    def reducer(key, values):
        yield key, sum(values)
    
    if __name__ == "__main__":
        import dumbo
        dumbo.run(mapper, reducer)
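
The mapper and reducer above are ordinary generator functions, so they can be tried out without Dumbo or Hadoop. The sketch below (Python 3) repeats them and simulates the map, sort and reduce phases in memory. `run_locally` is a hypothetical helper for illustration only, not a Dumbo API.

```python
from itertools import groupby
from operator import itemgetter


def mapper(key, value):
    for word in value.split():
        yield word, 1


def reducer(key, values):
    yield key, sum(values)


def run_locally(lines):
    """Simulate map -> sort/shuffle -> reduce for a list of input lines."""
    mapped = []
    for offset, line in enumerate(lines):
        mapped.extend(mapper(offset, line))
    # Sorting by key stands in for the shuffle phase: equal keys become adjacent.
    mapped.sort(key=itemgetter(0))
    results = {}
    for word, pairs in groupby(mapped, key=itemgetter(0)):
        for out_key, out_value in reducer(word, (count for _, count in pairs)):
            results[out_key] = out_value
    return results


print(run_locally(["to be or not to be", "be quick"]))
# prints {'be': 3, 'not': 1, 'or': 1, 'quick': 1, 'to': 2}
```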
    
  7. Finally, run the buildout script to fetch artifacts from the private or central PyPI.

    $ ./bin/buildout
    

    All the dependencies (and their dependencies) mentioned in setup.py are collected under the eggs/ directory. Zipped eggs are available under the dist/ directory, and executable scripts with their dependencies wired in are generated under the bin directory. Try viewing the contents of bin/dumbo.
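As far as I can tell from the generated file, the "wiring" amounts to putting the required egg paths at the front of sys.path before the entry point is imported. Here is a minimal sketch of that technique in Python 3. `wire_eggs` is a hypothetical helper, and unlike the generated script, which lists only the eggs the part actually needs, it simply adds every egg it finds.

```python
import os
import sys


def wire_eggs(eggs_dir, path=None):
    """Prepend every *.egg entry under eggs_dir to the import path."""
    path = sys.path if path is None else path
    eggs = sorted(
        os.path.join(eggs_dir, name)
        for name in os.listdir(eggs_dir)
        if name.endswith(".egg")
    )
    # Slice assignment inserts at the front, so these eggs take precedence
    # over anything already on the path.
    path[0:0] = eggs
    return eggs
```

After `wire_eggs("eggs")`, an `import dumbo` in the same process would be resolved from the eggs/ directory first, which is why the script needs no globally installed packages.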

  8. To run the job:

    $ ./bin/dumbo start wordcount-job/wordcount.py -input /tmp/input -output /tmp/output 
    

    wordcount.py can access all the dependencies (under eggs/ directory) mentioned in setup.py.

  9. To run the tests:

    $ ./bin/job-test
    

    Read more about the pbp.recipe.noserunner recipe and nose.
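The project so far has no tests for the runner to pick up. A minimal sketch of what a nose-style test module could look like, assuming a hypothetical file named test_wordcount.py (nose collects functions whose names start with test_). The mapper and reducer are repeated here only to keep the snippet self-contained.

```python
def mapper(key, value):
    for word in value.split():
        yield word, 1


def reducer(key, values):
    yield key, sum(values)


def test_mapper_emits_one_pair_per_word():
    assert list(mapper(0, "a b a")) == [("a", 1), ("b", 1), ("a", 1)]


def test_mapper_ignores_blank_lines():
    assert list(mapper(0, "   ")) == []


def test_reducer_sums_counts():
    assert list(reducer("a", [1, 1, 1])) == [("a", 3)]
```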

  10. To build an egg:

    $ ./bin/buildout setup . bdist_egg
    
  11. Private PyPI repositories can also be used to distribute Python eggs. To publish an egg to a private PyPI, create a config for pypicloud in your home directory:

    $ cd $HOME && vi .pypirc
    

    Copy and paste the following:

    [distutils]
    index-servers = my-pypi
     
    [my-pypi]
    repository: http://mypypicloud.example.com:6543/pypi/
    username: username
    password: password
    

    Then, from the project working directory:

    $ ./bin/buildout setup . bdist_egg upload -r my-pypi
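
Before uploading, it can help to confirm that the name passed to -r really resolves to a repository URL in .pypirc. The sketch below parses the same content with Python 3's configparser. `index_servers` is a hypothetical helper, and splitting the server list on whitespace is a simplification of mine, not necessarily how the upload command reads the file.

```python
from configparser import RawConfigParser

PYPIRC = """\
[distutils]
index-servers = my-pypi

[my-pypi]
repository: http://mypypicloud.example.com:6543/pypi/
username: username
password: password
"""


def index_servers(text):
    """Map each server listed under [distutils] to its repository URL."""
    parser = RawConfigParser()
    parser.read_string(text)
    names = parser.get("distutils", "index-servers").split()
    return {name: parser.get(name, "repository") for name in names}


print(index_servers(PYPIRC))
# prints {'my-pypi': 'http://mypypicloud.example.com:6543/pypi/'}
```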
    
  12. Source code
  13. Beer to bud @azhaguselvan for support and stuff.
