<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Pythonian &#187; PyPI</title>
	<atom:link href="http://thepythonian.com/tag/pypi/feed/" rel="self" type="application/rss+xml" />
	<link>http://thepythonian.com</link>
	<description>All about the Python Programming Language</description>
	<lastBuildDate>Fri, 11 Jun 2010 13:54:25 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Python and an Excel Reader, XLRD</title>
		<link>http://thepythonian.com/2009/06/17/python-and-and-excel-reader-xlrd/</link>
		<comments>http://thepythonian.com/2009/06/17/python-and-and-excel-reader-xlrd/#comments</comments>
		<pubDate>Wed, 17 Jun 2009 20:13:50 +0000</pubDate>
		<dc:creator>thePythonian</dc:creator>
				<category><![CDATA[Office]]></category>
		<category><![CDATA[excel]]></category>
		<category><![CDATA[OO.org]]></category>
		<category><![CDATA[openoffice]]></category>
		<category><![CDATA[OpenOffice.org]]></category>
		<category><![CDATA[PyPI]]></category>
		<category><![CDATA[xlrd]]></category>
		<category><![CDATA[xls]]></category>

		<guid isPermaLink="false">http://thepythonian.com/?p=20</guid>
		<description><![CDATA[<p>The Python module, XLRD, is a developer tool for reading Microsoft Excel formats up to 2003 using a library reverse-engineered in OpenOffice.org. The description states it is Unicode-aware, and I would add that it is very, very Unicode-aware, which means you may run into issues with files created under different language packs.</p>
<p>In my experience version [...]]]></description>
			<content:encoded><![CDATA[<p>The Python module, <a title="Library for developers to extract data from Microsoft Excel (tm) spreadsheet files" href="http://pypi.python.org/pypi/xlrd">XLRD</a>, is a developer tool for reading <a title="microsoft excel" href="http://office.microsoft.com/excel">Microsoft Excel</a> formats up to 2003 using a library reverse-engineered in <a title="API" href="http://www.lexicon.net/sjmachin/xlrd.html">OpenOffice.org</a>. The description states it is <em>Unicode-aware</em>, and I would add that it is very, very <em>Unicode-aware</em>, which means you may run into issues with files created under different language packs.</p>
<p>In my experience version 5.2 worked the best for exporting Excel columns and rows to ANSI text. As of this writing, the <a title="Python “xlrd” package for extracting data from Excel files" href="http://www.lexicon.net/sjmachin/xlrd.htm">home page</a> needs to be updated: it still lists 0.6.1 as the latest version, while <a title="Python Package Index" href="http://pypi.python.org/pypi">PyPI</a> has 0.7.1. This may have changed by the time you read this, as I do not keep track of XLRD development. Google is probably the best bet for tracking the latest developments.</p>
<p style="padding-left: 30px;"><em>Just for reference, files ending in .xls are Excel files in the format used up to Excel 2003, and files ending in .xlsx are Excel files in the format introduced with Excel 2007 (PC) and Excel 2008 (Mac).</em></p>
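<p>As a rough illustration of the note above, the sketch below sorts file names by extension. The function name <em>excel_format</em> and the returned labels are my own placeholders, not part of XLRD.</p>

```python
import os


def excel_format(filename):
    """Classify a spreadsheet file name by its extension (hypothetical helper)."""
    ext = os.path.splitext(filename)[1].lower()
    if ext == ".xls":
        return "Excel 2003 or earlier"
    if ext == ".xlsx":
        return "Excel 2007/2008 or later"
    return "unknown"


for name in ("budget.xls", "budget.xlsx", "notes.txt"):
    print(name, "is", excel_format(name))
```

<p>A check like this only looks at the name, so a mislabeled file will still be misclassified.</p>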
<p>Example usage [via ActiveState]: <a title="Recipe 483742: Easy Cross Platform Excel Parsing With Xlrd" href="http://code.activestate.com/recipes/483742/">Recipe 483742: Easy Cross Platform Excel Parsing With Xlrd</a></p>
<blockquote><p>The class allows you to create a generator which returns excel data one row at a time as either a list or dictionary.</p></blockquote>
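<p>To give a feel for the idea in that quote, here is a minimal sketch of a generator that hands back one row at a time as either a list or a dictionary. It is not the recipe's code: <em>iter_rows</em> and the sample data are made up for illustration, and it assumes the first row holds the column names. With XLRD, the rows could come from <em>sh.row_values(r)</em> for each row index.</p>

```python
def iter_rows(rows, as_dict=False):
    """Yield data rows one at a time; the first row is treated as the header."""
    rows = iter(rows)
    header = next(rows, None)
    if header is None:
        return  # nothing to yield for an empty sheet
    for row in rows:
        if as_dict:
            # Pair each header name with the value in the same column
            yield dict(zip(header, row))
        else:
            yield list(row)


# Stand-in data; with xlrd the rows could come from
# (sh.row_values(r) for r in range(sh.nrows)).
sample = [["name", "qty"], ["apple", 3.0], ["pear", 5.0]]
for record in iter_rows(sample, as_dict=True):
    print(record)
```

<p>Because it is a generator, only one row is held at a time, which helps with large sheets.</p>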
<p>Below is a simple example I wrote that only collects data from one column if a condition is met in another column, i.e. the latter column contains a positive whole number. I am including two values called <em>before</em> and <em>positivity</em>, which may be useful for removing undesirable results from your data, as I had to do in my full implementation. Please be pragmatic when using Python modules other than the standard modules that ship with <a title="CPython" href="http://en.wikipedia.org/wiki/CPython">CPython</a> (the core Python modules).</p>
<h3>USING XLRD 5.2</h3>
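<pre>import re
import xlrd

# Placeholder values: substitute your own file, sheet name and column indexes
parser = "data.xls"
wb = "Sheet1"
special_info = 0  # column to collect
addcol = 1        # column that must hold a positive whole number

book = xlrd.open_workbook(parser)
sh = book.sheet_by_name(wb)

# Start at row 1 to skip the header row
for r in range(1, sh.nrows):
    # str() of a cell looks like "text:'abc'" or "number:3.0"
    item = str(sh.row(r)[special_info])
    ad = str(sh.row(r)[addcol])

    # Example: index of the last quote on a text cell, used to slice it off
    before = len(item) - 1

    # Example: make sure the add column holds a positive whole number
    positivity = re.search(r"^number:0*[1-9][0-9]*(\.0+)?$", ad)

    # Skip text cells and anything that is not a positive whole number
    if re.search("text", ad) or not positivity:
        continue  # junk data in a badly formatted Excel spreadsheet

    # Placeholder output: the collected cell without its trailing quote
    final_output = item[:before]
    print(final_output)</pre>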
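<p>The same filter can also be written against plain cell values instead of their string form. The sketch below is only one possible approach, using made-up rows and two hypothetical helpers, <em>is_positive_whole</em> and <em>collect</em>. It assumes numeric cells arrive as floats, which is how XLRD reports them through <em>sh.cell_value(r, c)</em>.</p>

```python
def is_positive_whole(value):
    """Return True for numeric values such as 3 or 3.0, False otherwise."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return value > 0 and float(value).is_integer()


def collect(rows, keep_col, check_col):
    """Keep rows[i][keep_col] where rows[i][check_col] is a positive whole number."""
    return [row[keep_col] for row in rows if is_positive_whole(row[check_col])]


# Made-up rows standing in for spreadsheet data
rows = [["widget", 3.0], ["gadget", -2.0], ["gizmo", "n/a"], ["doohickey", 1.5]]
print(collect(rows, 0, 1))
```

<p>Working on values avoids depending on how a cell happens to print, at the cost of handling each cell type yourself.</p>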
<p><strong><em>UPDATE: This is in no way an endorsement of XLRD as a solution. </em></strong></p>
<p>I would not recommend building an application on top of XLRD if you have a database available for importing, such as MS SQL or Oracle. I used XLRD only because I had no database when I started development.</p>
]]></content:encoded>
			<wfw:commentRss>http://thepythonian.com/2009/06/17/python-and-and-excel-reader-xlrd/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
