Feb. 22nd, 2012

Django Tips: FileField and upload_to

The following tip should help in understanding how the FileField handles the path and url for a file referenced in the database.

The Model Object

FileField is the Django model field used to store uploaded files. The file itself is not stored in the database but on the filesystem, with a reference to the file location saved in the database. Once saved, the file can be easily accessed through 2 attributes:
  • document.path – returns the absolute path of the file on the filesystem
  • document.url – returns the URL of the file, to be used in a web browser

Settings

The Django docs reference 2 settings variables that should be set:
  • MEDIA_ROOT – Absolute path to the directory that holds media…
  • MEDIA_URL – URL that handles the media served from MEDIA_ROOT
Now, I don’t like to expose my filesystem structure to the internet, so I set MEDIA_URL to a specific directory and then use Apache (or a dedicated url configuration) to serve the files. For example:
MEDIA_URL='/media'
I also like to isolate my media so that there can be no conflicts, so I use 2 settings variables to store the media root:
MEDIA_ROOT = os.path.join( os.path.dirname(__file__), 'media/').replace('\\','/')
MODEL_DOC_ROOT = os.path.join( 'model', 'documents' ).replace('\\','/')
The url configuration is:
( r'^model/documents/(?P<path>.*)$', 'django.views.static.serve',
        { 'document_root': '%s/%s' % ( settings.MEDIA_ROOT, settings.MODEL_DOC_ROOT ) } ),

upload_to

Default

The Django docs for FileField say:
A local filesystem path that will be appended to your MEDIA_ROOT setting to determine the value of the url attribute.
This is important, because if you include the complete path, then document.path will be correct, but document.url will have the complete filesystem path in the url…ack!
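To make the problem concrete, here is a minimal sketch in Python 3 using only the standard library. It is not Django's storage code; it assumes the default storage behaves roughly like a path join against MEDIA_ROOT and a URL join against MEDIA_URL. The directory /srv/site/media/, the '/media/' URL prefix, the file name and the sketch_* helpers are all made up for illustration.

```python
import posixpath
from urllib.parse import urljoin

# Hypothetical settings, for illustration only
MEDIA_ROOT = '/srv/site/media/'
MEDIA_URL = '/media/'

def sketch_path(name):
    # Rough stand-in for document.path: MEDIA_ROOT joined with the stored name
    return posixpath.join(MEDIA_ROOT, name)

def sketch_url(name):
    # Rough stand-in for document.url: MEDIA_URL joined with the stored name
    return urljoin(MEDIA_URL, name)

# What gets stored when upload_to is relative vs. a complete filesystem path
relative_name = 'model/documents/report.pdf'
absolute_name = '/srv/site/media/model/documents/report.pdf'

for name in (relative_name, absolute_name):
    print(sketch_path(name), sketch_url(name))
```

With the relative name the URL stays under /media/. With the absolute name the path is still right, but the complete filesystem path shows up in the URL unchanged.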

As a function

The Django docs also say:
This may also be a callable, such as a function, which will be called to obtain the upload path, including the filename. This callable must be able to accept two arguments, and return a Unix-style path (with forward slashes) to be passed along to the storage system.
If I expect many files to be uploaded, it is cleaner to split them into subdirectories. My function turned out like the following:
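from os import makedirs, path
from django.conf import settings

def doc_location( instance, filename ):
    # root_path is relative to MEDIA_ROOT: one subdirectory per slug
    root_path = path.join( settings.MODEL_DOC_ROOT, instance.name.slug ).replace('\\','/')
    full_path = settings.MEDIA_ROOT + path.join( '/', root_path ).replace('\\','/')

    # makedirs (unlike mkdir) also creates any missing parent directories
    if not path.exists( full_path ):
        makedirs( full_path )

    return path.join( root_path, filename ).replace('\\','/')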

Obscure Pitfalls

The above works quite well… but what the docs don’t say is:
  • If MODEL_DOC_ROOT starts with ‘/’ then MEDIA_ROOT is NOT prepended as indicated in the docs!
  • In the doc_location function, if settings.MEDIA_ROOT is included in the path.join call ahead of the ‘/’, it doesn’t end up in the result!?
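
A likely explanation for both pitfalls is the documented behaviour of Python's os.path.join: when a component is an absolute path, every component before it is thrown away. The sketch below (Python 3) uses posixpath, the Unix flavour of os.path, so the result is the same on any platform. The /srv/site/media/ directory is a hypothetical MEDIA_ROOT.

```python
import posixpath  # same join rules as os.path on Unix-like systems

media_root = '/srv/site/media/'  # hypothetical MEDIA_ROOT

# A relative component is appended as expected
appended = posixpath.join(media_root, 'model/documents')

# A component starting with '/' discards everything before it
discarded = posixpath.join(media_root, '/model/documents')

# The same happens when a bare '/' appears anywhere after MEDIA_ROOT
also_discarded = posixpath.join(media_root, '/', 'model/documents')

print(appended)
print(discarded)
print(also_discarded)
```

Only the first call keeps the media root, which is why the relative MODEL_DOC_ROOT matters.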

Posted in: Django

Oct. 12th, 2011

Android Thread Handling

Almost every Android application has two parts. On one side is the user interface, which the user controls and views information with. On the other side is a network component for talking to a server or updating the application's information. When you write an Android application you are always given control of the thread that lets you change the user interface, but that thread is meant only for updating the user interface. Sure, you can use it to make network calls, but this is a big “no no”. The problem is that the Android application needs the UI Thread to make updates, so if you send it off on an errand to retrieve some data from a server, the application has no way to update the controls. Essentially the party stops and waits until the network call is complete. From a user's perspective this is awful. They pushed a button, and now… nothing. Then suddenly everything is back.

In my experience it's fine to start creating the application this way, but before you can release it you will need to split the network calls into a separate thread.

This article is about doing just that.

Step 1.  Isolate your network calls with a thread call.

It's assumed at this point that you already have an Android app that is making network calls. So first go through your code and surround each of them with a block of code that looks like this.

Essentially change this
makeNetworkCall();

into this
new Thread(new Runnable() {
   @Override
   public void run() {
      makeNetworkCall();
   }
}).start();

When you do this, what you are doing is spinning off a new thread to run out and do the waiting work. The new problem is, what happens when this new thread is done?

Step 2. The Handler

Behind the scenes is an Android class called Handler.

This little class can be used to build a sort of callback in your main UI Thread so the new threads have some way to notify it when they are done. To use it, add this bit of code to your Activity class.

private Handler handler = new Handler() {
   @Override
   public void handleMessage(Message msg) {
      updateTheUI(); //This method is for whatever needs to happen after the network call is complete
   }
};

Now anytime you want to actually make a call back to your UI Thread you will simply have to make this call.

handler.sendEmptyMessage(0);
Those are the basics, but they leave a big question. What if I want to send more information than just update?

Step 3. Smarter Messages

If you want to send a better message than just update, you need to figure out how many types of messages you want and what they should be.  They have to be integer values so it would make the most sense to create a bunch of constants in the class to represent these calls.  For instance…
private static final int REDRAW = 42;

You will then need to modify your handler to look for this particular code.
private Handler handler = new Handler() {
   @Override
   public void handleMessage(Message msg) {
      if(msg.what == REDRAW) {
         updateTheUI(); //This method is for whatever needs to happen after the network call is complete
      }
   }
};

Finally, to call this special method you will need to modify your handler calling code to look like this.
handler.sendEmptyMessage(REDRAW);

In conclusion

Use a handler when you want to spin off new threads to keep the UI experience responsive. If you have lots of threads going, then create some custom callback messages. This will make all of your apps feel instantly responsive.

Posted in: Android, Java, Mobile Development

Oct. 8th, 2011

Part One: Compass and Sass

Problem

As a Ruby on Rails developer, you have lots of choices for creating visually wonderful websites. And with the introduction of Sass in Rails 3.1, CSS has become easier to code than ever. But say your Rails app grows, and with it your Sass and CSS. With such growth, your files expand. You begin distributing Sass and CSS code across different files, such as header, footer, and content, and all this becomes a messy business. Get the picture? What if you could consolidate everything into, say, just a file or two? Well, with Compass and Sass, you can!

This article deals with the introduction and installation of Compass and Sass and their basic features.

Solution with Compass and Sass

Compass is a generic framework that relies heavily on Sass to generate CSS output. Sass, on the other hand, is a metalanguage that adds programming features such as “nested rules, variables, mixins, selector inheritance, and more”. Compass and Sass are tightly coupled and are a marriage made in heaven for web designers: the two eliminate the redundancy you would face if you coded everything in plain CSS. And, if you are currently using CSS3, then you’ll love Compass and Sass. Compass has many great CSS3 mixins, such as box-shadow, that save you so much time typing vendor prefixes.

Compass and Sass Features

  1. Top Level
    1. CSS Reset
    2. Basic Element Styling
    3. Basic Structural Rules
    4. Rules for Browser Inconsistencies
  2. Form and Table Styling
  3. Typography
  4. CSS Grid

Installing

For this example, I am using an iMac and its Terminal to generate the necessary files and move around the file structure. My Ruby and Rails run on RVM.

With RVM installed on your computer (and I hope you have RVM installed on your system; if not, take a look at the RVM site for the installation guide), run the following command:

gem install compass

However, if you are not using RVM, run the following command:

sudo gem install compass

Now make sure compass is installed:

compass version

If all went well, you should see compass’s version number as well as its credits.

Creating a Compass Project

Creating a Compass project is very simple:

compass create /path/to/project --using blueprint

Notice that at the end of the command I am using the option flag --using with blueprint as the CSS framework for the project. I could have left out both of these and would have ended up with just the project itself. But I like blueprint, so I am going to leave it the way it is. Oh, and if you wish to use something other than blueprint, you can. Just make sure the framework is compatible with Compass. Here are a few for your creative juices. However, it isn’t necessary to use any framework. Compass has its own CSS library too!

This command also prints the initial file structure Compass has created for you, and it tells you to place the last part of its message, which is the stylesheet links, in your index.html page. Do that and you are ready to move on to the final step before getting into the application itself. Here is my stylesheets code:

   <link href="/stylesheets/screen.css" media="screen, projection" rel="stylesheet" type="text/css" />
   <link href="/stylesheets/print.css" media="print" rel="stylesheet" type="text/css" />
   <link href="/stylesheets/ie.css" media="screen, projection" rel="stylesheet" type="text/css" />

The final step is to make sure that Compass watches for any changes in your files, in this case files that end with .scss or .sass. These files will then be compiled into CSS as the final output! Wow. That’s going to be a lot of CSS! Yes, but it will be a compressed CSS that’s much faster for browser rendering. To do so, issue the following:

compass watch /path/to/project

That’s it! From here onwards, all we need to do is get into one of those .sass or .scss files and start coding.

Next Article Coming Soon

Posted in: Design, Rails, Ruby, Web Design

Sep. 12th, 2011

Software Application Pills

A friend of mine once pointed out that when you are creating software, a very good way to look at it is to ask “how dependent will your users be on your software?”. Then, in his usual clever way, he related it to pills. Well, not pills so much, but pill-like items. In this regard, how do you want to package the software for easy consumption? How often do you want people to use it? How hard do you want it to be for your clients to walk away?

It turns out that his metaphor splits nicely into 4 categories; candy, vitamins, medicine and heart meds.

Candy

At the candy level you are talking about fun software. You are looking to have a huge base of people that might try this out on an impulse. It’s right there at the checkout stand. You don’t really need it, you could walk away. But somehow candy keeps on selling. It’s a treat, it’s fun, it’s colorful, playful, and something new. There are lots of apps like this. The whole game industry is built around candy apps. In no way, shape, or form do I need to be playing Minecraft, Angry Birds, or any of these other huge distractions. But it is great amusement and enjoyable. Thus… candy.

Do notice that the price point and volume have to reflect this. I don’t buy designer candy. I’m not even sure if that exists. Candy is cheap, but it makes up for that in volume. Also think about the life cycle of your relationship with candy. It’s short. Maybe you come back for that brand of candy, but it isn’t something you have to have.

Vitamins

Next come the vitamins. These are like nuggets of hope and prevention all wrapped into one easy-to-consume package. At the root, why do you take vitamins? It’s a type of insurance, right? It keeps you from getting sick, or helps you get stronger. These are your anti-virus software, your backup software. They are a little investment of time now for a lot fewer problems later.

Now also think about the volume of these sales. People take vitamins regularly. Well… sort of. You don’t have to take them, so sometimes you skip, or you come back to them later. But the point is to do something regularly. A vitamin is more expensive than candy, but you don’t eat a bunch of vitamins like you would candy. The same goes for software. You only need a couple of these apps. You probably don’t try out anti-virus software packages like you would games. You pick one and stick with it. Same sort of mentality.

Medicine

You are even more dependent on medicines than vitamins. You might choose to not take a vitamin, but you feel foolish not taking your medicine. There is software like this as well. It’s core to your business. Software that keeps things moving. You could get rid of it, but it would be a bad idea and you would need to find an equivalent. This is software you don’t want to ignore.

Heart Meds

Finally there are heart meds. These are a type of medicine you have to take every day or you might die. That makes them very important to you. These are expensive, you plan around them, you budget for them, you make sure that at all costs they are available. Some software is like this as well. In many industries there is that one vendor or programmer that the entire venture hinges on. This is what is most important to you. This isn’t the same as an off-the-shelf equivalent, but more like a custom software package that is integrated into the company. The point is dependency. With heart-med software your users must have it. Usually these are older pieces of software that are well wedged into the system. Think about Salesforce applications or business process applications.

Conclusion

So when you are thinking about creating software, it might help to think about how dependent you want your users to be. Do you want the high-volume, transient sales of candy, or the low-volume, pricey heart meds? You need to price and plan for this accordingly. People don’t buy super expensive candy. So why would you make a game to sell for $100?

Also consider that with this comes the development cost and the cost to get these introduced to new customers.  For instance, I bet it is much easier to make and sell candy than it is to sell designer drugs.  It’s just a hunch.

Posted in: Software Development

Jul. 21st, 2011

VMware and Ubuntu High Load issues

Intro

Recently I was involved in moving a large website from Slicehost to an ASP (Application Service Provider) closer to the client. Everything went well, until DNS was updated. Load on the server was high all the time and would increase to a point where web pages wouldn’t be served any more.

Comparison

Slicehost and the ASP provided essentially the same server setup, as per our request:

          Slicehost      ASP
# CPUs    4              4
RAM       3 GB           3 GB
OS        Ubuntu Lucid   Debian 6.0
VM        Xen            VMware ESX
The website contains over 50 GB of images with sizes up to 1.5 MB. All the pages are dynamically created with PHP connected to MySQL. There is a lightly used Ruby app also running via mod_passenger. On an hourly basis a PHP script is run to fetch an XML file, which is then parsed and imported into the database. None of this is very complicated, and there are numerous optimizations that could be made.

Statistics

After the move, the server load would range between 6 and 25. Once the load hit 15, the websites would start to become unavailable. But ‘top’ was only showing around 20% overall CPU usage!

So I started toying with the standard stat programs:

# mpstat 1 20
Linux 2.6.32-5-686 (pmm) 	07/13/2011 	_i686_	(4 CPU)
01:21:45 PM  CPU  %usr %nice  %sys %iowait %irq %soft %steal %guest %idle
01:21:46 PM  all  0.27  0.00  0.53  72.27  0.00  0.00  0.00   0.00  26.93
01:21:47 PM  all  0.73  0.00  0.49  72.37  0.00  0.00  0.00   0.00  26.41
01:21:48 PM  all  4.49  0.00  0.47  31.68  0.00  0.00  0.00   0.00  63.36
01:21:49 PM  all  8.15  0.00  0.99  42.22  0.00  0.25  0.00   0.00  48.40
01:21:50 PM  all 11.86  0.00  0.77  41.49  0.00  0.26  0.00   0.00  45.62
01:21:51 PM  all  7.07  0.00  0.25  38.38  0.00  0.00  0.00   0.00  54.29
01:21:52 PM  all  6.93  0.00  0.80  51.20  0.00  0.00  0.00   0.00  41.07
01:21:53 PM  all  6.83  0.00  0.68  35.54  0.00  0.00  0.00   0.00  56.95
Average:     all  8.94  0.00  0.57  41.86  0.01  0.04  0.00   0.00  48.57

# w
13:22:05 up 48 days, 12:43,  2 users,  load average: 12.00, 11.50, 10.10

CPU %usage from mpstat matched what top was indicating, but the %iowait was very high. To me, this means the processors are waiting for data to work on.
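
For reference, the %iowait column can be reproduced from two samples of the first line of /proc/stat. The following Python 3 sketch assumes the usual Linux field order (user, nice, system, idle, iowait, irq, softirq, steal). The two sample lines are invented so the arithmetic is easy to follow and are not taken from this server, and the helper names are my own.

```python
def cpu_times(stat_line):
    # "cpu user nice system idle iowait irq softirq steal guest guest_nice"
    # Keep the first 8 counters; guest time is already counted inside user.
    return [int(value) for value in stat_line.split()[1:9]]

def iowait_percent(before, after):
    # Share of the elapsed CPU time that was spent waiting on I/O
    deltas = [b - a for a, b in zip(cpu_times(before), cpu_times(after))]
    total = sum(deltas)
    return 100.0 * deltas[4] / total if total else 0.0

# Made-up samples taken some time apart
before = 'cpu 1000 0 200 5000 3000 0 10 0 0 0'
after = 'cpu 1001 0 201 5026 3072 0 10 0 0 0'

print(iowait_percent(before, after))
```

On a live system the two lines would come from reading /proc/stat twice with a short sleep in between.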


# iostat (edited)
Device: rrqm/s    wrqm/s     r/s       w/s     rsec/s    wsec/s  avgrq-sz  avgqu-sz     await     svctm     %util
sda      0.00     25.00      0.00     10.00      0.00    280.00     28.00      0.04      4.40      1.20      1.20
sda      0.00      0.00      1.00      0.00      8.00      0.00      8.00      0.00      4.00      4.00      0.40
sda      0.00     20.00      3.00      9.00     72.00    232.00     25.33      0.06      5.33      2.00      2.40
sda      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
sda      0.00      0.00      1.00      0.00      8.00      0.00      8.00      0.01      8.00      8.00      0.80
--- import starts ---
sda      1.00   1216.00     11.00    135.00    984.00  10808.00     80.77     20.06    137.37      2.93     42.80
sda      0.00    137.00     65.00    973.00   1168.00   8880.00      9.68      0.95      0.92      0.76     79.20
sda      0.00    148.00     81.00   1414.00    704.00  12496.00      8.83      0.93      0.62      0.55     82.00
sda      1.00    165.00     42.00   1468.00    432.00  13064.00      8.94      1.04      0.69      0.55     83.20
sda      0.00    160.00     24.00   1125.00    192.00  13136.00     11.60      0.84      0.73      0.69     78.80
sda      0.00    148.00     56.00   1571.00    448.00  13752.00      8.73      0.71      0.44      0.44     70.80
sda      0.00    172.00     43.00   1327.00    376.00  11992.00      9.03      0.79      0.58      0.56     76.80
sda      0.00    142.00     55.00   1306.00    464.00  11584.00      8.85      0.82      0.61      0.53     72.00

Wow! There appears to be a lot of disk I/O going on here. What it really means, I’m not sure.

# iotop (recreated since I don't have a historical version)
Total DISK READ: 267.11 K/s | Total DISK WRITE: 0.00 B/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                           
20876 be/4 mysql       0.00 B/s   15.71 K/s  0.00 %  68.00 % mysqld --basedir=/usr --da~ld/mysqld.sock --port=3306
28051 be/4 www-data  129.63 K/s    0.00 B/s  0.00 %  40.44 % apache2 -k start
28061 be/4 www-data  137.48 K/s    0.00 B/s  0.00 %  31.12 % apache2 -k start
28059 be/4 www-data    0.00 B/s    3.93 K/s  0.00 %  23.00 % apache2 -k start

Once again, lots of I/O going on. It is interesting that Apache requires so much I/O for reading. I expected MySQL to require much more I/O since it is reading AND writing to the database.

# apache2ctl status (edited)
                       Apache Server Status for localhost
   --------------------------------------------------------------------------
   CPU Usage: u319.69 s97.34 cu.05 cs0 - 4.88% CPU load
   36 requests currently being processed, 14 idle workers

 _WWC_W_W_WWWW_WLW_CWCW_W_WW______WC.............................
 ................................................................
 ................................................................
 ................................................................

   Scoreboard Key:
   "_" Waiting for Connection, "S" Starting up, "R" Reading Request,
   "W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
   "C" Closing connection, "L" Logging, "G" Gracefully finishing,
   "I" Idle cleanup of worker, "." Open slot with no current process

Apache doesn’t think it is under much load, but there are a lot of open connections.

Watching ‘top’ over time, I noted that the import process was consuming 99% of one of the 4 CPUs. The other CPUs were generally 98% idle, but Apache requests would occasionally use between 5% and 12% of each CPU. I knew the import acted this way beforehand, and it wasn’t a problem on the Slicehost server. So that wasn’t the root of the problem.

Actions Taken

Apache

I turned ‘KeepAlive off’ thinking each ‘client browser’ was hogging an Apache process, and eventually all the available forked processes would become blocked. My thinking was that the high I/O was being caused because the images couldn’t be read from disk and passed out fast enough. So open Apache processes were constantly waiting for data to be sent.

I set a specific number of Apache processes to run according to the available memory left over from MySQL. This has the effect that once memory is assigned to a forked Apache process it doesn’t go away. I also set the number of Ruby threads to a very low number to help control memory usage.
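
The sizing itself is simple arithmetic. A rough Python 3 sketch follows; the memory figures are placeholders for illustration, not measurements from this server, and max_apache_workers is a made-up helper name.

```python
def max_apache_workers(total_mb, reserved_mb, per_worker_mb):
    # Whole Apache processes that fit in the memory left after other services
    return max(0, (total_mb - reserved_mb) // per_worker_mb)

# Hypothetical figures: a 3 GB server, half reserved for MySQL and the OS,
# and about 30 MB of resident memory per Apache process
workers = max_apache_workers(3072, 1536, 30)
print(workers)
```

The per-process figure is the one worth measuring on the real server, since it varies a lot with the modules loaded.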

These changes gave a little breathing room when load increased above 15, but the problem was by no means solved.

MySQL

I tweaked MySQL to better utilize all the remaining memory, thinking the import process was making MySQL work harder than the default settings could handle. I switched the db engine from MyISAM to InnoDB. That appeared to make the webpages appear faster, but the import was taking 20% longer!

I ran some benchmark tests that I found in the document titled Virtualization for MySQL on VMware.

This document uses sysbench to run a test using MySQL and a temporary database. I didn’t delve into the details, but wanted to see a rudimentary comparison of the engines and how the test would affect server load.

An example command-line usage:

# ./sysbench \
    --num-threads=4 \
    --max-time=900 \
    --max-requests=50000 \
    --test=oltp \
    --mysql-user=root \
    --mysql-host=localhost \
    --mysql-port=3306 \
    --mysql-table-engine=innodb \
    --oltp-test-mode=complex \
    --oltp-table-size=8000000 \
    run > MYRESULTS.txt

I tested myisam and innodb. The test results were interesting by themselves. When I ran the tests, each of which took 5 minutes, the load was around 2-4, but the load did NOT increase while the tests were running!

Even though the tests indicated that innodb is faster by almost a factor of 2, it was slowing down the import process, so we switched back to myisam.

System

Someone pointed out that a 32-bit OS was installed. Ugh, not much can be done about that, and I don’t believe it would cause problems of this proportion.

Then someone found the following article that suggests changing the I/O scheduler from cfq to noop. Huh?

It turns out to be very easy. All you have to do is run the following command:

# echo noop > /sys/block/sda/queue/scheduler

And… load has stabilized and been where we expect it to be.
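
To check which scheduler is active before and after the change, read the same file back: the kernel lists the available schedulers and puts square brackets around the one in use. A small Python 3 sketch, where active_scheduler and read_scheduler are helper names I made up and 'sda' is assumed to be the disk in question:

```python
import re

def active_scheduler(text):
    # The active scheduler is the name inside square brackets,
    # e.g. "noop deadline [cfq]" means cfq is in use
    match = re.search(r'\[([^\]]+)\]', text)
    return match.group(1) if match else None

def read_scheduler(device='sda'):
    # Reads the same sysfs file that the echo command above writes to
    with open('/sys/block/%s/queue/scheduler' % device) as handle:
        return active_scheduler(handle.read())

print(active_scheduler('noop deadline [cfq]'))
print(active_scheduler('[noop] deadline cfq'))
```

Calling read_scheduler() on the server itself should report noop once the echo command has been run.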

To make this persist after a reboot in Debian, just modify the ‘GRUB_CMDLINE_LINUX’ line in /etc/default/grub to be:

GRUB_CMDLINE_LINUX="elevator=noop"

Then run:
# update-grub

To make this persist after a reboot with RedHat/CentOS just add “elevator=noop” to the end of your kernel line in /boot/grub/menu.lst:

title CentOS (2.6.18-238.12.1.el5xen)
        root (hd0,0)
        kernel /xen.gz-2.6.18-238.12.1.el5 dom0_mem=256M elevator=noop
        module /vmlinuz-2.6.18-238.12.1.el5xen ro root=/dev/VolGroup00/LogVol00
        module /initrd-2.6.18-238.12.1.el5xen.img

Resolution

It turns out the OS has a scheduler that manages low level I/O operations. The default scheduler, Completely Fair Queuing (CFQ), tries to distribute the available I/O bandwidth equally among all I/O requests. This is great if all those requests are being sent to one piece of hardware in the local machine that handles all I/O. In our case this server is using a Network Appliance connected via Fibre Channel, which should be fast enough to handle anything we throw at it. I didn’t find this out until late into the troubleshooting phase.

The NOOP scheduler assumes whatever data is sent to the I/O device will be handled by that device in the most efficient manner. Most of the links I’ve found reference NOOP with SSD drives, but it also makes complete sense once you factor in our particular setup, because the Network Appliance is much faster than any local drive could be.

Conclusion

The statistics pointed to an I/O issue, and that was in fact the root of the problem. As for knowing about such a setting, I can only say that it is something you just learn from experience. In the troubleshooting process, we also found other things that needed to be tweaked and noted things to consider as this site grows.

Posted in: Systems, Virtual Machine