It has been a wild few months since moving back here to the great PNW. One recent highlight was receiving an award for a joint patent that I filed with a couple of close friends and fellow engineers at Microsoft back in October. The patent is based on the concept of flexible VM disk images, which incorporate a layer of metadata describing the different configuration standards in which an image can be used, while doing away with the need to store each of those individual images directly on disk. This in turn means that deployments can be sped up without the extra overhead.
Sorry for the lack of posts lately, but I am currently in the process of moving from SVC back to good ol’ Seattle, Washington. I am still working at Microsoft, except now I will be back in an area that makes much more sense for an aging techy like me. Cost of living will certainly feel a lot better too. I will be back soon though, hopefully with better consistency in updates as well.
I was recently chatting with someone who made the statement: “The only reason why I use my text editor is because of tab completion, I am a minimalist pythonista.” I responded with, “If you’re a minimalist, why don’t you have tab completion enabled in the interpreter?” They were unaware of the ability to enable tab completion within Python. Here is my PSA on how to enable this behavior. :)
By following these instructions you will be able to enable tab-completion within your python shell.
Preparation steps:
Before enabling tab-completion, you may need to install two Python modules (rlcompleter and readline). While these libraries are mostly included with Python 2.6+, some platforms (OS X, for example) require an updated version for readline to function correctly.
shell
pip install readline
pip install rlcompleter
Step 1:
If you don’t already have a ~/.pyrc file, this command will create one for you, which is required for this trick to work.
shell
#> touch ~/.pyrc
Step 2:
Now we will create a file within your homedir which will instruct python to bind tab completion at python launch.
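As a minimal sketch of what ~/.pyrc could contain (this is an assumption about a typical setup, not necessarily the exact file I used), the following binds the tab key to completion. The libedit branch is there because the readline that ships with OS X is usually backed by libedit, which uses a different binding syntax.

```python
# ~/.pyrc: executed by the interpreter at startup when PYTHONSTARTUP points at it.
try:
    import readline
    import rlcompleter  # noqa: F401  (importing it registers the default completer)
except ImportError:
    # readline is not available on this platform, so there is nothing to bind.
    readline = None
else:
    if "libedit" in (readline.__doc__ or ""):
        # libedit-backed readline (common on OS X) uses its own binding syntax.
        readline.parse_and_bind("bind ^I rl_complete")
    else:
        readline.parse_and_bind("tab: complete")
```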
Now to ensure that your newly created ~/.pyrc file is executed each time python starts, add the following to your ~/.bashrc (or equiv. shell rc).
shell
#> export PYTHONSTARTUP="[PATH TO PYRC FILE]/.pyrc"
# *OR*, to make the change persist after the terminal is closed:
#> echo 'export PYTHONSTARTUP="[PATH TO PYRC FILE]/.pyrc"' >> ~/.bashrc
Now to test, execute the following:
shell
#> source ~/.bashrc #reloads your ~/.bashrc file (if you added the entry to your ~/.bashrc, else ignore)
Now we will open the actual python shell, and view the tab completion goodness.
For example, if you import the “os” module, type “os.” and press <TAB>, all of the matching object names available within the module are shown. Voilà, Python tab completion is enabled.
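If you want to see what the interpreter would offer without pressing any keys, the same completer can be driven by hand. This is only an illustrative sketch: the prefix "os.pa" and the variable names are my own choices, and the exact list of matches depends on your platform and Python version.

```python
import os
import rlcompleter

# Give the completer a namespace that contains the "os" module.
completer = rlcompleter.Completer({"os": os})

# complete() is called with increasing state values until it returns None.
matches = []
state = 0
while True:
    match = completer.complete("os.pa", state)
    if match is None:
        break
    matches.append(match)
    state += 1

print(matches)
```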
When looking at the Python programming language, decorators are one of the most powerful and, unfortunately, most underutilized “macro” design patterns within the language. One of the reasons I believe this to be the case is that most explanations of decorators suck. When you read the word “decorator” in regards to application development, most people generally think of the decorator pattern from the classic Design Patterns book ( http://www.amazon.com/gp/product/0201633612/ref=ase_bruceeckelA/ ). While the “decorator” in Python can indeed be implemented in the fashion of that design pattern, doing so is a very limited version of what decorators can actually accomplish.
I believe that decorators can actually be thought of more as “macros” than as the classical design pattern mentioned above. A macro ( http://en.wikipedia.org/wiki/Macro_(computer_science) ), as defined by Wikipedia, is “a rule or pattern that specifies how a certain input sequence (often a sequence of characters) should be mapped to a replacement output sequence (also often a sequence of characters) according to a defined procedure.” In short, if you have some metadata that you want to apply to any class, function or object, dress it up with a decorator.
Enough already, get to the example!
Decorators allow you to inject or modify code in functions or classes. Sounds a bit like Aspect-Oriented Programming (AOP) in Java, doesn’t it? So let’s say you have an action that you would like to perform at the entry point (execution) or exit point (return) of a class or function. This is a prime example of when to use a decorator.
@myDecorator  # The @myDecorator line denotes the application of a decorator
def myFunction():
    pass
Function decorators
Affixing @decoratorname on the line directly above a function (or class) definition denotes that the result of the decorator is applied to it. In the previous example, when the Python parser passes over “myFunction()”, the function is compiled and, in turn, passed to the “myDecorator” code block. This code block creates a function-like object, which is ultimately what gets called whenever “myFunction()” is called. Confusing? Maybe this example will help.
decorator_example2.py
class myDecorator(object):

    def __init__(self, func_object):
        print("Hello from inside myDecorator.__init__()")
        func_object()  # Execute func_object() to prove it has been executed.

    def __call__(self):
        print("Hello from inside myDecorator.__call__()")


@myDecorator
def myFunction():
    print("Hello from inside myFunction()")

print("Finished decorating myFunction()")

myFunction()
When you execute the above code, your results will look something like this:
output
Hello from inside myDecorator.__init__()
Hello from inside myFunction()
Finished decorating myFunction()
Hello from inside myDecorator.__call__()
Note that the initialization of myDecorator is executed when myFunction() is defined and decorated, not when it is called. “Hello from inside myFunction()” is printed at that point only because __init__() calls “func_object()”, which is just myFunction() passed into myDecorator.__init__() as a function object labeled “func_object”. Generally, you’ll store the function object in the constructor and later use it in the __call__() method.
When myFunction() is called after it has been decorated, we get completely different behavior; the myDecorator.__call__() method is called instead of the original code. This is due to the fact that decoration replaces the original function object with the result of the decoration. In our case, the myDecorator object replaces myFunction.
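To illustrate the more typical arrangement, where the constructor only stores the function and __call__() wraps it with work at the entry and exit points, here is a minimal sketch. The names trace_calls and add are hypothetical and exist only for this illustration.

```python
class trace_calls(object):
    """Hypothetical decorator: report entry to and exit from a function."""

    def __init__(self, func_object):
        # Runs once, at decoration time: just remember the function.
        self.func_object = func_object

    def __call__(self, *args, **kwargs):
        # Runs on every call made through the decorated name.
        print("Entering %s()" % self.func_object.__name__)
        result = self.func_object(*args, **kwargs)
        print("Exiting %s()" % self.func_object.__name__)
        return result


@trace_calls
def add(a, b):
    return a + b


total = add(2, 3)
print(total)
```

Because the wrapped function is invoked inside __call__(), its return value still reaches the caller, which the earlier example (which never calls the original function again) does not do.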
That is it for the introduction to Python decorators, but look out for part II, coming soon… ☺
I ended up on a roll tonight writing my documentation on Python recursive functions and decided to continue with the second portion of this blog entry (the first post can be found here). In this post I plan to provide you with a slightly more complicated example of a recursive function while also showing a side-by-side comparison of recursion vs. iteration (spoiler alert, iteration wins!).
Italians and integers, breeding like rabbits!
My example algorithm that I am introducing for advanced recursive method usage is none other than the Fibonacci number.
The Fibonacci numbers are a sequence of the following integer values:
[0,1,1,2,3,5,8,13,21,34,55,89,144 …]
The Fibonacci numbers are defined by the equation:
equation
F_n = F_{n-1} + F_{n-2}
with F_0 = 0 and F_1 = 1
The Fibonacci numbers are named after the mathematician Leonardo of Pisa, who is better known as Fibonacci. In his book “Liber Abaci” (published 1202) he introduced the sequence as an exercise dealing with (biologically unrealistic) rabbit breeding habits. His version of the sequence begins with F_1 = 1, while in modern mathematics the sequence starts with F_0 = 0, but this has no effect on the other members of the sequence.
OK, now that you are familiar with the introduction to Fibonacci numbers, let’s get to the gooey, nerdy center of this example! ☺
Solving for Fibonacci sequencing in Python
The Fibonacci numbers are the result of an artificial rabbit population, satisfying the following conditions:
A newly born pair of rabbits, one male, one female, build the initial population.
The rabbits are able to successfully mate at the age of one month. This means that at the end of the second month of life, the female rabbit gives birth to two “hoppy” (harhar), healthy rabbits. The new sibling pair of rabbits consists of one male and one female. Did I mention that all of the rabbit spawn are immortal?!? Every rabbit from here on out will never die and will just keep producing offspring. What a job, eh? Every new male/female pair of rabbits will continue to mate after the second month of life until infinity. The Fibonacci numbers are the numbers of rabbit pairs after n months, e.g. after 10 months we will have F_{10} pairs of rabbits.
The Fibonacci equation is very easy to program, as your equation depicted within the code is almost 1:1 with the original equation:
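As a minimal sketch (the function name fibonacci_recursive matches the one used later in this post, but the body is an assumption rather than the original listing), the recursive version could look like this:

```python
def fibonacci_recursive(n):
    """Return the n-th Fibonacci number, with F_0 = 0 and F_1 = 1."""
    if n < 2:
        # Base cases: F_0 = 0 and F_1 = 1.
        return n
    # Recursive case mirrors the equation F_n = F_{n-1} + F_{n-2}.
    return fibonacci_recursive(n - 1) + fibonacci_recursive(n - 2)


print(fibonacci_recursive(10))
```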
The above example depicts the Fibonacci numbers solution by using recursive methods. Now I will provide an example of a python function that returns the same Fibonacci numbers only using iteration instead of recursion.
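Again as a sketch rather than the original listing (only the name fibonacci_iterative is taken from this post), an iterative version might keep just the last two values and step forward n times:

```python
def fibonacci_iterative(n):
    """Return the n-th Fibonacci number, with F_0 = 0 and F_1 = 1."""
    previous, current = 0, 1
    for _ in range(n):
        # Slide the window forward: (F_k, F_k+1) becomes (F_k+1, F_k+2).
        previous, current = current, previous + current
    return previous


print(fibonacci_iterative(10))
```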
If you try both of these functions in your Python interpreter, you will notice that, as n grows, the fibonacci_iterative function becomes orders of magnitude faster than its fibonacci_recursive equivalent.
Why is recursion so slow?!
In Java, C, and Python, recursion is fairly expensive compared to iteration (in general) because it requires the allocation of a new stack frame. In some C compilers, one can use a compiler flag to eliminate this overhead, which transforms certain types of recursion (actually, certain types of tail calls) into jumps instead of function calls.
In functional programming language implementations, iteration can sometimes be very expensive and recursion can be very cheap. In many of them, recursion is transformed into a simple jump, but changing the loop variable (which is mutable) sometimes requires some relatively heavy operations, especially on implementations which support multiple threads of execution. Mutation is expensive in some of these environments because of the interaction between the mutator and the garbage collector, if both might be running at the same time.
There are, however, practical ways to help recursive functions along in order to speed them up. Let’s move on to the race between recursion and iteration, and then we can describe how to speed up the recursive function in this example.
The great race
In the example below we are going to place both of our above Fibonacci functions into an importable python file. Then we are going to write a new script that imports / executes both functions while performing timing calculations comparing the execution times.
Now we will write a python script that will allow for us to import and execute the two functions within “import_me.py” and measure the execution times.
lets_race.py
from timeit import Timer

for i in range(1, 41):
    # Import, execute and time the recursive function (3 runs).
    s = "fibonacci_recursive(" + str(i) + ")"
    t1 = Timer(s, "from import_me import fibonacci_recursive")
    time1 = t1.timeit(3)

    # Import, execute and time the iterative function (3 runs).
    s = "fibonacci_iterative(" + str(i) + ")"
    t2 = Timer(s, "from import_me import fibonacci_iterative")
    time2 = t2.timeit(3)

    # time1 / time2 is a ratio (how many times faster), not a percentage.
    print("n=%2d, recursive exec time: %8.6f, iterative exec time: %7.6f, iterative speedup: %10.2fx" % (i, time1, time2, time1 / time2))
In part three of this blog series I will cover in greater detail why the recursive solution is slower than the iterative solution. I will also cover some tips / tricks that we can do to help tune the recursive function in order to make it perform much like its iterative counterpart.
I have been asked recently to write up a description on recursion in regards to python recursive functions. I figured this is a topic that generally confuses most engineers that I work with and is probably worthy of a quick blog post.
So first of all what is recursion?
Recursive functions are functions that call themselves in their definition. Because a recursive function calls on itself to perform its task, it can make jobs that contain identical work on multiple data objects easier to conceptualize, plan and write. Recursion can also be quite taxing on the server on which it is being run, and it has limitations which can cause issues down the road. For example, Python caps how deeply recursive calls may nest: CPython’s default recursion limit is 1000 regardless of operating system, although the limit can be changed with sys.setrecursionlimit(), so the value on your system may differ.
*note: to determine the value set on your system, open your Python interpreter and run the following:
determine recursion limit
import sys
sys.getrecursionlimit()
Another thing to consider when writing a recursive function is whether recursion is indeed the best approach. I have seen instances where introducing recursion actually complicated the code and performed poorly compared with other design patterns.
simple example
def some_func(z):
    some_func(z)
Now the above example has an obvious issue in the fact that it never returns which causes a loop. This loop will continue to iterate until it reaches the system set limitation (as described previously). If you were to run the above code the end result would be the following snippet.
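As a sketch of that outcome on Python 3 (where the error type is RecursionError; older interpreters raise RuntimeError instead), you can catch the error rather than let the traceback end the program. The exact wording of the message may vary slightly between versions.

```python
def some_func(z):
    # No base case: every call immediately makes another call.
    some_func(z)


try:
    some_func(1)
except RecursionError as error:
    # Raised once the interpreter's recursion limit has been reached.
    message = str(error)
    print("RecursionError:", message)
```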
To avoid the previous example of recursion running uncontrollably until reaching the limits of the system, we have conditionals which are required to properly gate recursion. This means that functions that make use of recursion require conditions to be satisfied in order to either continue to recursively call itself or return. The common conditionals used are if/else.
Example recursion code snippet
The most common example used to describe recursion is the calculation of the factorial of a given number. In the spirit of this, I am going to provide a code example that uses recursion to determine the factorial of a given number. As a quick explanation, if you’re unfamiliar with factorials: a factorial is a product of multiplication, namely the number resulting from multiplying a whole number by every whole number between itself and 1 inclusive (n! = n * (n-1) * (n-2) * … * 1).
factorial example
"""
An example of a factorial number is the following:
5! = 5 * 4 * 3 * 2 * 1
Or 5! = 5 * 4!
Etc…
"""


def factorial(number):
    if number <= 1:
        # Base case: return 1, which lets all of the pending
        # multiplications unwind into the final factorial.
        return 1
    else:
        # Recursive case: call this function again, subtracting 1
        # from the previously provided number.
        return number * factorial(number - 1)


print(factorial(5))
By copying the above code and executing it within Python, you will be provided with the value of 5!. The expected result is 120.
This topic is an interesting one that I have faced on a few occasions. There are a couple of different solutions that I have found over the last couple of years that work depending on the situation. So let’s back up again and state the problem in the form of a question.
“How do you call a function in Python given a string of the function’s name?”
A string cannot be called directly, because Python is strongly typed: an object must be callable (a function, for instance) before you can call it. So how do you approach calling a function by using a string? *note that both of these examples assume that the functions being called reside within imported modules.
Here are a couple of examples:
1.) Use the getattr() built-in function to evaluate a module object and provide a string attribute denoting the name of the function to execute.
*note This method is generally frowned upon as using getattr and eval are both possibly dangerous built in functions that can lead to security issues / unexpected results.
example:
simple_example
import foo  # Your module

methodCall = getattr(foo, 'bar')
result = methodCall()
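Since “foo” and “bar” above are placeholders, here is a runnable variant of the same idea against the standard library’s math module. The helper name call_by_name is hypothetical; passing a default to getattr() and checking callable() is one way to fail with a clear error when the name does not exist.

```python
import math


def call_by_name(module, name, *args):
    """Hypothetical helper: look up `name` on `module` and call it."""
    func = getattr(module, name, None)  # None instead of AttributeError
    if not callable(func):
        raise LookupError("%s has no callable named %r" % (module.__name__, name))
    return func(*args)


result = call_by_name(math, "sqrt", 16)
print(result)
```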
2.) You can also write an abstract function capable of evaluating the string provided in a function object, and also evaluate if the function exists in a graceful way.
This approach is a bit more complex and makes use of a couple of different modules, along with leveraging decorator functions.
example:
Place this code into an importable module file called registry.py
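What follows is only a sketch of the general shape such a module could take, not the original listing: the names REGISTRY, register and call_by_name are assumptions made for this illustration. A decorator records each function under its name, and a lookup helper fails gracefully when the name is unknown.

```python
# registry.py (illustrative sketch)
REGISTRY = {}


def register(func):
    """Decorator: record the function under its own name, then return it unchanged."""
    REGISTRY[func.__name__] = func
    return func


def call_by_name(name, *args, **kwargs):
    """Call a registered function given its name as a string."""
    try:
        func = REGISTRY[name]
    except KeyError:
        raise LookupError("No registered function named %r" % name)
    return func(*args, **kwargs)


@register
def bar():
    return "bar was called"


print(call_by_name("bar"))
```

Only functions that were explicitly decorated can be reached this way, which is safer than handing arbitrary strings to getattr() or eval().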
As you can see, example 2 is quite a bit more involved; however, it can scale to support MANY different function template files. This means that you can further abstract your code into a template language to dynamically import with.
As a former engineer at Amazon AWS, I have plenty of experience with AWS services and have exclusively leveraged SQS in many projects. I have always had a mostly positive experience with Amazon AWS SQS ( https://aws.amazon.com/sqs/ ); however, I have been curious for quite some time about what the benefit of RabbitMQ ( http://www.rabbitmq.com/ ) could be vs. SQS. Message queuing is not a new concept and has existed for years within computer science/engineering; see http://en.wikipedia.org/wiki/Message_queue if you are unfamiliar with the concept.
*I would like to firstly point out that my experiences with both services have been with Python, however I did create a test instance and performed the same tests with Java yielding the same results.
So I have to start out by saying that Amazon’s offering for queuing services is fantastic, relatively cheap and easy. The community has also built some fantastic programmatic libraries to interact with the web services. So now you may ask, why look at other services? Well, SQS also has some possible drawbacks that can cause you problems depending on the situation.
One drawback of the SQS implementation is the need to poll a message queue to determine if new messages have appeared. This is a bit of an issue, since you must now model your application around a polling cycle that checks whether new messages are available and, if so, build the logic for consuming each message and popping it out of the queue. You must also be mindful of queue settings that control such things as TTL, maximum message length and which endpoints (if multiple are used) the message is destined for.
Beyond the small issue of having to be responsible for your own polling cycle, you are also billed based upon the number of requests to a queue. I believe the breakdown on billing is something along the lines of 1 million requests == $100.00. While this isn’t a ton of money, it can still get quite costly if your distributed app has hundreds of queues that must all be polled, especially if your messages come in bursts, in which case a lot of empty queue polling penalizes you. There are different approaches you can take with SQS to apply back-off algorithms to the polling logic, but the penalty is a delay that cascades through all other inter-dependent services.
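To make the polling cycle and the back-off idea concrete, here is a minimal sketch that uses the standard library’s in-memory queue.Queue as a stand-in for a remote queue. Nothing here is the SQS API; poll_with_backoff and its parameters are hypothetical names chosen for this illustration.

```python
import queue
import time


def poll_with_backoff(message_queue, handle, max_empty_polls=3,
                      base_delay=0.01, max_delay=0.1):
    """Poll until several consecutive polls come back empty.

    Returns the number of polls made, since every poll is a (billable) request.
    """
    delay = base_delay
    empty_polls = 0
    requests = 0
    while empty_polls < max_empty_polls:
        requests += 1
        try:
            message = message_queue.get_nowait()
        except queue.Empty:
            # Nothing waiting: sleep, then double the delay up to max_delay.
            empty_polls += 1
            time.sleep(delay)
            delay = min(delay * 2, max_delay)
            continue
        handle(message)
        # A message arrived, so reset the back-off.
        empty_polls = 0
        delay = base_delay
    return requests


pending = queue.Queue()
for body in ("a", "b", "c"):
    pending.put(body)

received = []
request_count = poll_with_backoff(pending, received.append)
print(received, request_count)
```

The growing sleep reduces the number of empty polls you pay for, at the cost of slower reaction when the next burst arrives, which is exactly the trade-off described above.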
So now enter RabbitMQ.
RabbitMQ is a message queue system written in Erlang and conforming to AMQP (a standard and heavily used message queue protocol). There is obvious overhead in the fact that you must host your own instances of RabbitMQ along with the infrastructure. Reliability through multi-AZ redundancy will also need to be considered, unless you continue to host your instance within EC2 and configure it appropriately.
So now let’s talk about the speed of RabbitMQ. I have one word to describe it: FAST! My testing was performed within Amazon EC2 / AWS within the same region and AZ to be fair. My test was simple: build a queue, spray 5,000 messages to the queue as fast as possible, then de-queue and discard each message. There are many different configurations that RabbitMQ can support, such as one-to-one, one-to-many, many-to-many and RPC. In my example I simply used the default, which is one-to-one.
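The shape of that test can be sketched as a small timing harness. This is not the script I ran: time_round_trip is a hypothetical helper, and the in-memory queue.Queue below merely stands in for whichever client’s send and receive calls you plug in.

```python
import queue
import time


def time_round_trip(send, receive, message_count=5000):
    """Send message_count messages, then receive and discard them all.

    `send` and `receive` are whatever callables your queue client provides.
    Returns the elapsed wall-clock time in seconds.
    """
    start = time.perf_counter()
    for number in range(message_count):
        send("message %d" % number)
    for _ in range(message_count):
        receive()  # de-queue and discard
    return time.perf_counter() - start


# Stand-in queue so the sketch runs locally without any network service.
local_queue = queue.Queue()
elapsed = time_round_trip(local_queue.put, local_queue.get)
print("Total time: %.3f seconds" % elapsed)
```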
So now the “quarter mile” times between the two.
Amazon AWS SQS:
Region: IAD
Service: SQS
Total time: 5:42 minutes:seconds
RabbitMQ EC2:
Region: IAD
Service: EC2
Total time: 0:06 minutes:seconds
What does this prove!?!
Nothing really, besides the fact that shoving a lot of messages down the pipe and pulling them out on the other side IS indeed faster using RabbitMQ. This does not prove, however, that the actual service is “better” than SQS by any stretch of the imagination. Both services have their pros and cons, and SQS has a rich history of reliable service that affords a great mixture of safe queuing along with a rock solid infrastructure.
On the other hand, depending on the configuration of RabbitMQ, you can gain a lot of these safeties with a seemingly small hit to performance in terms of delivery times. I hope to revisit this topic once I have more time to spend; hopefully this helps educate anyone who is faced with choosing between these services in the future though.
While working on a project, I was having some issues storing data on a MySQL server that required many reads/writes. Normally this wouldn’t be a huge deal, except that the systems are states away from each other with less than reliable throughput and no option for scalable cloud solutions. I did however have another host in close proximity that was capable of hosting a Redis based server for me to read from and write to. I have worked with Redis once before, but this experience has been AWESOME so far! Key/data stores are one thing, but the speed of this technology, when implemented correctly, blows my mind every time. If you haven’t had the chance to work with Redis you should give it a try; here is an AWESOME interactive demo: http://try.redis.io/