
Learning Path

Advanced Python Programming

Build high performance, concurrent, and multi-threaded apps with Python using proven design patterns

Dr. Gabriele Lanaro, Quan Nguyen and Sakis Kasampalis

FOR SALE IN INDIA ONLY

www.packt.com


BIRMINGHAM - MUMBAI

Advanced Python Programming

Copyright © 2019 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author(s), nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First Published: February 2019
Production Reference: 2280219

Published by Packt Publishing Ltd.
Livery Place, 35 Livery Street
Birmingham, B3 2PB, U.K.

ISBN 978-1-83855-121-6

www.packtpub.com

mapt.io

Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.

Why Subscribe?

• Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
• Improve your learning with Skill Plans built especially for you
• Get a free eBook or video every month
• Mapt is fully searchable
• Copy and paste, print, and bookmark content

Packt.com

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.

Contributors

About the Authors

Dr. Gabriele Lanaro is passionate about good software and is the author of the chemlab and chemview open source packages. His interests span machine learning, numerical computing visualization, and web technologies. In 2013, he authored the first edition of the book High Performance Python Programming. He has been conducting research to study the formation and growth of crystals using medium and large-scale computer simulations. In 2017, he obtained his PhD in theoretical chemistry.

Quan Nguyen is a Python enthusiast and data scientist. Currently, he works as a data analysis engineer at Micron Technology, Inc. With a strong background in mathematics and statistics, Quan is interested in the fields of scientific computing and machine learning. With data analysis being his focus, Quan also enjoys incorporating technology automation into everyday tasks through programming. Quan's passion for Python programming has led him to be heavily involved in the Python community. He started as a primary contributor for the Python for Scientists and Engineers book and various open source projects on GitHub. Quan is also a writer for the Python Software Foundation and an occasional content contributor for DataScience.com (part of Oracle).

Sakis Kasampalis is a software engineer living in the Netherlands. He is not dogmatic about particular programming languages and tools; his principle is that the right tool should be used for the right job. One of his favorite tools is Python because he finds it very productive. Sakis has also technically reviewed the Mastering Object-oriented Python and Learning Python Design Patterns books, both published by Packt Publishing.

Packt Is Searching for Authors Like You

If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.

Table of Contents

Preface

Chapter 1: Benchmarking and Profiling
    Designing your application
    Writing tests and benchmarks
    Timing your benchmark
    Better tests and benchmarks with pytest-benchmark
    Finding bottlenecks with cProfile
    Profile line by line with line_profiler
    Optimizing our code
    The dis module
    Profiling memory usage with memory_profiler
    Summary

Chapter 2: Pure Python Optimizations
    Useful algorithms and data structures
    Lists and deques
    Dictionaries
    Building an in-memory search index using a hash map
    Sets
    Heaps
    Tries
    Caching and memoization
    Joblib
    Comprehensions and generators
    Summary

Chapter 3: Fast Array Operations with NumPy and Pandas
    Getting started with NumPy
    Creating arrays
    Accessing arrays
    Broadcasting
    Mathematical operations
    Calculating the norm
    Rewriting the particle simulator in NumPy
    Reaching optimal performance with numexpr
    Pandas
    Pandas fundamentals
    Indexing Series and DataFrame objects
    Database-style operations with Pandas

    Mapping
    Grouping, aggregations, and transforms
    Joining
    Summary

Chapter 4: C Performance with Cython
    Compiling Cython extensions
    Adding static types
    Variables
    Functions
    Classes
    Sharing declarations
    Working with arrays
    C arrays and pointers
    NumPy arrays
    Typed memoryviews
    Particle simulator in Cython
    Profiling Cython
    Using Cython with Jupyter
    Summary

Chapter 5: Exploring Compilers
    Numba
    First steps with Numba
    Type specializations
    Object mode versus native mode
    Numba and NumPy
    Universal functions with Numba
    Generalized universal functions
    JIT classes
    Limitations in Numba
    The PyPy project
    Setting up PyPy
    Running a particle simulator in PyPy
    Other interesting projects
    Summary

Chapter 6: Implementing Concurrency
    Asynchronous programming
    Waiting for I/O
    Concurrency
    Callbacks
    Futures
    Event loops
    The asyncio framework
    Coroutines

    Converting blocking code into non-blocking code
    Reactive programming
    Observables
    Useful operators
    Hot and cold observables
    Building a CPU monitor
    Summary

Chapter 7: Parallel Processing
    Introduction to parallel programming
    Graphic processing units
    Using multiple processes
    The Process and Pool classes
    The Executor interface
    Monte Carlo approximation of pi
    Synchronization and locks
    Parallel Cython with OpenMP
    Automatic parallelism
    Getting started with Theano
    Profiling Theano
    Tensorflow
    Running code on a GPU
    Summary

Chapter 8: Advanced Introduction to Concurrent and Parallel Programming
    Technical requirements
    What is concurrency?
    Concurrent versus sequential
    Example 1 – checking whether a non-negative number is prime
    Concurrent versus parallel
    A quick metaphor
    Not everything should be made concurrent
    Embarrassingly parallel
    Inherently sequential
    Example 2 – inherently sequential tasks
    I/O bound
    The history, present, and future of concurrency
    The history of concurrency
    The present
    The future
    A brief overview of mastering concurrency in Python
    Why Python?
    Setting up your Python environment
    General setup
    Summary

    Questions
    Further reading

Chapter 9: Amdahl's Law
    Technical requirements
    Amdahl's Law
    Terminology
    Formula and interpretation
    The formula for Amdahl's Law
    A quick example
    Implications
    Amdahl's Law's relationship to the law of diminishing returns
    How to simulate in Python
    Practical applications of Amdahl's Law
    Summary
    Questions
    Further reading

Chapter 10: Working with Threads in Python
    Technical requirements
    The concept of a thread
    Threads versus processes
    Multithreading
    An example in Python
    An overview of the threading module
    The thread module in Python 2
    The threading module in Python 3
    Creating a new thread in Python
    Starting a thread with the thread module
    Starting a thread with the threading module
    Synchronizing threads
    The concept of thread synchronization
    The threading.Lock class
    An example in Python
    Multithreaded priority queue
    A connection between real-life and programmatic queues
    The queue module
    Queuing in concurrent programming
    Multithreaded priority queue
    Summary
    Questions
    Further reading

Chapter 11: Using the with Statement in Threads
    Technical requirements

    Context management
    Starting from managing files
    The with statement as a context manager
    The syntax of the with statement
    The with statement in concurrent programming
    Example of deadlock handling
    Summary
    Questions
    Further reading

Chapter 12: Concurrent Web Requests
    Technical requirements
    The basics of web requests
    HTML
    HTTP requests
    HTTP status code
    The requests module
    Making a request in Python
    Running a ping test
    Concurrent web requests
    Spawning multiple threads
    Refactoring request logic
    The problem of timeout
    Support from httpstat.us and simulation in Python
    Timeout specifications
    Good practices in making web requests
    Consider the terms of service and data-collecting policies
    Error handling
    Update your program regularly
    Avoid making a large number of requests
    Summary
    Questions
    Further reading

Chapter 13: Working with Processes in Python
    Technical requirements
    The concept of a process
    Processes versus threads
    Multiprocessing
    Introductory example in Python
    An overview of the multiprocessing module
    The process class
    The Pool class
    Determining the current process, waiting, and terminating processes
    Determining the current process

    Waiting for processes
    Terminating processes
    Interprocess communication
    Message passing for a single worker
    Message passing between several workers
    Summary
    Questions
    Further reading

Chapter 14: Reduction Operators in Processes
    Technical requirements
    The concept of reduction operators
    Properties of a reduction operator
    Examples and non-examples
    Example implementation in Python
    Real-life applications of concurrent reduction operators
    Summary
    Questions
    Further reading

Chapter 15: Concurrent Image Processing
    Technical requirements
    Image processing fundamentals
    Python as an image processing tool
    Installing OpenCV and NumPy
    Computer image basics
    RGB values
    Pixels and image files
    Coordinates inside an image
    OpenCV API
    Image processing techniques
    Grayscaling
    Thresholding
    Applying concurrency to image processing
    Good concurrent image processing practices
    Choosing the correct way (out of many)
    Spawning an appropriate number of processes
    Processing input/output concurrently
    Summary
    Questions
    Further reading

Chapter 16: Introduction to Asynchronous Programming
    Technical requirements
    A quick analogy
    Asynchronous versus other programming models

    Asynchronous versus synchronous programming
    Asynchronous versus threading and multiprocessing
    An example in Python
    Summary
    Questions
    Further reading

Chapter 17: Implementing Asynchronous Programming in Python
    Technical requirements
    The asyncio module
    Coroutines, event loops, and futures
    Asyncio API
    The asyncio framework in action
    Asynchronously counting down
    A note about blocking functions
    Asynchronous prime-checking
    Improvements from Python 3.7
    Inherently blocking tasks
    concurrent.futures as a solution for blocking tasks
    Changes in the framework
    Examples in Python
    Summary
    Questions
    Further reading

Chapter 18: Building Communication Channels with asyncio
    Technical requirements
    The ecosystem of communication channels
    Communication protocol layers
    Asynchronous programming for communication channels
    Transports and protocols in asyncio
    The big picture of asyncio's server client
    Python example
    Starting a server
    Installing Telnet
    Simulating a connection channel
    Sending messages back to clients
    Closing the transports
    Client-side communication with aiohttp
    Installing aiohttp and aiofiles
    Fetching a website's HTML code
    Writing files asynchronously
    Summary
    Questions
    Further reading

Chapter 19: Deadlocks
    Technical requirements
    The concept of deadlock
    The Dining Philosophers problem
    Deadlock in a concurrent system
    Python simulation
    Approaches to deadlock situations
    Implementing ranking among resources
    Ignoring locks and sharing resources
    An additional note about locks
    Concluding note on deadlock solutions
    The concept of livelock
    Summary
    Questions
    Further reading

Chapter 20: Starvation
    Technical requirements
    The concept of starvation
    What is starvation?
    Scheduling
    Causes of starvation
    Starvation's relationship to deadlock
    The readers-writers problem
    Problem statement
    The first readers-writers problem
    The second readers-writers problem
    The third readers-writers problem
    Solutions to starvation
    Summary
    Questions
    Further reading

Chapter 21: Race Conditions
    Technical requirements
    The concept of race conditions
    Critical sections
    How race conditions occur
    Simulating race conditions in Python
    Locks as a solution to race conditions
    The effectiveness of locks
    Implementation in Python
    The downside of locks
    Turning a concurrent program sequential
    Locks do not lock anything

    Race conditions in real life
    Security
    Operating systems
    Networking
    Summary
    Questions
    Further reading

Chapter 22: The Global Interpreter Lock
    Technical requirements
    An introduction to the Global Interpreter Lock
    An analysis of memory management in Python
    The problem that the GIL addresses
    Problems raised by the GIL
    The potential removal of the GIL from Python
    How to work with the GIL
    Implementing multiprocessing, rather than multithreading
    Getting around the GIL with native extensions
    Utilizing a different Python interpreter
    Summary
    Questions
    Further reading

Chapter 23: The Factory Pattern
    The factory method
    Real-world examples
    Use cases
    Implementing the factory method
    The abstract factory
    Real-world examples
    Use cases
    Implementing the abstract factory pattern
    Summary

Chapter 24: The Builder Pattern
    Real-world examples
    Use cases
    Implementation
    Summary

Chapter 25: Other Creational Patterns
    The prototype pattern
    Real-world examples
    Use cases
    Implementation
    Singleton

This Learning Path shows you how to leverage the power of both native and third-party Python libraries for building robust and responsive applications. You will learn about profilers and reactive programming, concurrency and parallelism, as well as tools for making your apps quick and efficient. You will discover how to write code for parallel architectures using TensorFlow and Theano, and use a cluster of computers for large-scale computations using technologies such as Dask and PySpark. With the knowledge of how Python design patterns work, you will be able to clone objects, secure interfaces, dynamically choose algorithms, and accomplish much more in high performance computing. By the end of this Learning Path, you will have the skills and confidence to build engaging models that quickly offer efficient solutions to your problems.

This Learning Path includes content from the following Packt products:

• Python High Performance - Second Edition by Gabriele Lanaro
• Mastering Concurrency in Python by Quan Nguyen
• Mastering Python Design Patterns by Sakis Kasampalis

Things you will learn:

• Use NumPy and pandas to import and manipulate datasets
• Achieve native performance with Cython and Numba
• Write asynchronous code using asyncio and RxPy
• Design highly scalable programs with application scaffolding
• Explore abstract methods to maintain data consistency
• Clone objects using the prototype pattern
• Use the adapter pattern to make incompatible interfaces compatible
• Employ the strategy pattern to dynamically choose an algorithm
