Tuesday, July 5, 2011

=?iso-8859-1?Q?Interpret_X-ray_images ?=

Get Started







===================================================================================================================================================== truqtural makes it easy to get data into a database, and easy to work with it once it's there. Some of the big features include: * Automatically generate all tables and relations needed to represent XML in a database, including one to one, one to many, many to one and many to many relationships (JSON and YAML support is planned for a future release), returning mapped objects. * Automatically generate python objects for a specified table in a database, with support for discovery of all types of relationships (sort of like SQL Soup on 4 different kinds of steroids). * Automatically create persisted representations of python objects in a database, along with a persistent version of the object class. * Automatically infer SQL data types and create table representations for delimited text files (i.e. CSV/TSV), returning mapped objects. * Easily generate and query EAV/vertical attribute tables. * Easily generate and query graphs/trees/directed acyclic graphs. * Easily manage session configuration and creation for multiple databases. * Easily override almost all behavior with custom code or configuration variables * And much more... Once you're up and running, it's SQL Alchemy under the hood, with a few tune-ups that you are free to bypass if you don't like them. Home page: http://www.turnkey-analytics.com/struqtural/ PyPI: http://pypi.python.org/pypi/Struqtural Source: https://code.launchpad.net/pydatastep Bugs: https://bugs.launchpad.net/pydatastep To see Struqtural in action, take a look at the tutorial... ================================================= Struqtural has been designed to be as smart as possible. Because of that, most common use cases are retardedly simple. LetÂ's take a look, shall we? First, letÂ's just examine the most basic use case: from struqtural.structures.structure import Structure from struqtural.loaders import loader data = [{"A":True, "B":1, "C":1.5, "D":"a"}, ... {"A":False, "B":2, "C":2.5, "D":"b"}, ... {"A":True, "B":3, "C":3.5, "D":"c"}, ... {"A":False, "B":4, "C":4.5, "D":"d"}] example_structure = Structure(loader.CollectionLoader, data, "B", "SimpleInstance") print "primary keys:", example_structure.primary_keys primary keys: ['B'] # Note that you could have also specified the primary key using an iterable print "table name:", example_structure.table table name: SimpleInstances print "table columns:", example_structure.columns table columns: ['SimpleInstances.A', 'SimpleInstances.C', 'SimpleInstances.B', 'SimpleInstances.D'] for true_instance in example_structure.filter("B>2"): ... print true_instance SimpleInstance(A=True, C=3.5, B=3, D="c") SimpleInstance(A=False, C=4.5, B=4, D="d") for small_true_instance in example_structure.query().filter("A", "C<3"): ... print small_true_instance SimpleInstance(A=True, C=1.5, B=1, D="a") Another nice feature is that Struqtural is pretty flexible about how you pass it data: from struqtural.structures.structure import Structure from struqtural.loaders import loader more_data = [["T", "1", "1.0"], ... ["FALSE", "2", "2.0"], ... ["true", "3", "3.0"], ... ["False", "4", "4.0"]] # Note that more_data is strings, which isn't uncommon in the real world data_headers = ["A", "B", "E"] categories = Structure(loader.CollectionLoader, (data_headers, more_data), ["B", "E"], "Category") # Just a quick aside to demonstrate that Struqtural gets your table names right print "table name:", categories.table table name: Categories for category in categories: ... print category ... Category(A=True, B=1, E=1.0) Category(A=False, B=2, E=2.0) Category(A=True, B=3, E=3.0) Category(A=False, B=4, E=4.0) As you can see the strings have been handled elegantly. Type conversion is completely controllable and extensible. A small collection of useful type inference and conversion functions have been included. How about if we want to create a new structure out of multiple previously existing ones: example_structure = Structure(loader.CollectionLoader, data, "B", "SimpleInstance") connector = example_structure.connector categories = Structure(loader.CollectionLoader, (data_headers, more_data), ["B", "E"], "Category", connector) joint_structure = Structure(loader.JoinLoader, categories, example_structure) for instance in joint_structure: ... print instance Category(A=True, B=1, E=1.0, C=1.5, D="a") Category(A=False, B=2, E=2.0, C=2.5, D="b") Category(A=True, B=3, E=3.0, C=3.5, D="c") Category(A=False, B=4, E=4.0, C=4.5, D="d") Note that we took advantage of some intelligence on the join we did there, since we have identically named, identically typed columns in the two tables. If you needed to be explicit about it you could just as easily have called structure like so (where the dictionary key is the column on the left table, in this case categories, and the value is the column on the joined table): joined_with = (example_structure, {'A':'A', 'B':'B'}) joint_structure = Structure(loader.JoinLoader, categories, joined_with) Next, letÂ's create persisted versions of pre-existing objects: from struqtural.structures.structure import Structure from struqtural.loaders import loader class Mock(object): ... def __init__(self, a, b, c): ... self.a = a ... self.b = b ... self.c = c mocks = [Mock(1, 2, 3), Mock(4, 5, 6), Mock(7, 8, 9)] another_example_structure = Structure(loader.ObjectLoader, mocks, "a") for mock in another_example_structure.query(): ... print mock Mock(a=1, c=3, b=2) Mock(a=4, c=6, b=5) Mock(a=7, c=9, b=8) Pretty easy, right? Notice that you got a nice __repr__ for free there, too. Any type which can be converted to a known SQL primitive will be mapped automatically by default, though this is easily configurable. If a variable cannot be directly mapped, or more than one type occurs on for a given variable name, by default it will be pickled so that what comes off the database is identical to what went in, though this can also be easily disabled. Note This does not directly instrument pre-existing objects. It populates the database with the data of the objects. Any changes made to the original Mock objects in the above example will not be persisted. I suggest replacing the original objects with the new objects from the session immediately. Also, rather than modify the original class, a modified subclass is created, so it is advisable to save the modified class as well. This was a design decision, however it would be trivial to switch, so the behavior may change depending on user feedback. Now on to something a little bit more complicated... Struqtural can automatically generate a schema from XML. As a basic example, letÂ's use the following document: 123 10/29/00 007 Bond, Inc. 1 3 ABC 12.95 What happens when you give this document to Struqtural? LetÂ's take a look: from struqtural.structures.structure import MultiStructure from struqtural.loaders import loader, file as file_ multi = MultiStructure(file_.XmlMultiLoader, fake_file_1) order_structure = multi.properties["Order"] for order in order_structure.query(): ... print "Order number:", order.number ... print "Order date:", order.date ... print "Order customer count:", len(order.customers) ... for customer in order.customers: ... print "Order customer:", customer ... print "Order line count:", len(order.lines) ... for line in order.lines: ... print "Order line:", line ... print "Order line part count:", len(line.parts) ... for part in line.parts: ... print "Order line part:", part ... Order number: 123 Order date: 2000-10-29 Order customer count: 1 Order customer: Customer(order_id=1, custnum=7, name="Bond, Inc.", id=1) Order line count: 1 Order line: Line(order_id=1, linenum=1, id=1, quantity=3) Order line part count: 1 Order line part: Part(line_id=1, price=12.95, id=1, partnum="ABC") Ok, Struqtural automatically figured out all the relations and properties from the XML file, thatÂ's pretty neat. Because we didnÂ't set any tag primary keys or provide a SchemaFormatter object with a primary key identification engine, an id key is automatically added. Since this behavior is typically not desirable, it is very easy to override. Note You will probably notice that our customer has an order_id... In this case, that customer appears only on the one Order, so that is not too odd. Had the customer appeared on multiple orders, the order would have had a customer_id. Even better, if multiple customers appeared on any order, a secondary table would be automatically created to handle the relationship, as weÂ'll demonstrate in a minute. So Struqtural can get data into a database easily, you say. What if my data is already in the database? What can you do for me? No problem, check it out: from struqtural.structures.structure import MultiStructure from struqtural.loaders import loader, file as file_ multi = MultiStructure(file_.XmlMultiLoader, fake_file_1) order_structure = multi.properties["Order"] connector = order_structure.connector # This isn't strictly needed, but gives us nice, informative names. # The key in names is the table name, the value is the name you would like to give # the instance you are creating to represent that table. names = {"Customers":"Customer", "Lines":"Line", "Parts":"Part"} multi = MultiStructure(loader.DatabaseMultiLoader, "Orders", connector, names) order_structure = multi.properties["Orders"] for order in order_structure.query(): ... print "Order number:", order.number ... print "Order date:", order.date ... print "Order customer count:", len(order.customers) ... for customer in order.customers: ... print "Order customer:", customer ... print "Order line count:", len(order.lines) ... for line in order.lines: ... print "Order line:", line ... print "Order line part count:", len(line.parts) ... for part in line.parts: ... print "Order line part:", part Order number: 123 Order date: 2000-10-29 Order customer count: 1 Order customer: Customer(order_id=1, custnum=7, name="Bond, Inc.", id=1) Order line count: 1 Order line: Line(order_id=1, linenum=1, id=1, quantity=3) Order line part count: 1 Order line part: Part(line_id=1, price=12.95, partnum="ABC", id=1) You just got the Â"OrdersÂ" table and everything connected with it (including stuff that points to it as well as stuff it points to) loaded and mapped, and it took almost no work! YouÂ're free to do interesting stuff! Pretty awesome, huh? LetÂ's push things to the edge now with a quick demo of many to many relationship support. For this example weÂ're going to be using the following XML: 123 Sales 143 Raul Lopez 687 John Smith 947 Ming Chu 456 Marketing 157 Jim Jones 687 John Smith 947 Ming Chu Note that weÂ're also going to be passing tag primary keys here, so that things work properly: from struqtural.structures.structure import MultiStructure from struqtural.loaders import loader, file as file_ primary_keys = {"Department":"DeptNum", "Employee":"Number"} multi = MultiStructure(file_.XmlMultiLoader, fake_file_2, ... primary_keys=primary_keys) department_structure = multi.properties["Department"] for department in department_structure.query(): ... print "Department number:", department.deptnum ... print "Employee count:", len(department.employees) ... for employee in department.employees: ... print "Employee name:", employee.name ... print "Employee number:", employee.number ... Department number: 123 Employee count: 3 Employee name: Raul Lopez Employee number: 143 Employee name: John Smith Employee number: 687 Employee name: Ming Chu Employee number: 947 Department number: 456 Employee count: 3 Employee name: Jim Jones Employee number: 157 Employee name: John Smith Employee number: 687 Employee name: Ming Chu Employee number: 947 If you already have a database with many to many relationships, that is no problem. As long as the secondary tables for the relationship donÂ't contain any non-foreign key columns or foreign keys for more than two tables, youÂ're in good shape. For example: from struqtural.structures.structure import MultiStructure from struqtural.loaders import loader, file as file_ primary_keys = {"Department":"DeptNum", "Employee":"Number"} multi = MultiStructure(file_.XmlMultiLoader, fake_file_2, ... primary_keys=primary_keys) department_structure = multi.properties["Department"] connector = department_structure.connector names = {"Departments":"Department", "Employees":"Employee"} multi = MultiStructure(loader.DatabaseMultiLoader, "Departments", connector, ... names, backref_by_default=True) departments = multi.properties["Departments"] for department in department_structure.query(): ... print "Department number:", department.deptnum ... print "Employee count:", len(department.employees) ... for employee in department.employees: ... print "Employee name:", employee.name ... print "Employee number:", employee.number ... Department number: 123 Employee count: 3 Employee name: Raul Lopez Employee number: 143 Employee name: John Smith Employee number: 687 Employee name: Ming Chu Employee number: 947 Department number: 456 Employee count: 3 Employee name: Jim Jones Employee number: 157 Employee name: John Smith Employee number: 687 Employee name: Ming Chu Employee number: 947 You will note that in all these examples we used a memory resident SQLite database. If you want to use a real database, you will need to pass a connector to your structure. You create one like so: from struqtural.database import DatabaseConnector connector = DatabaseConnector("my_database_config.cfg") ThatÂ's it! In fact, Struqtural will even automatically look in a few logical places for your configuration information, so you can set it up once and youÂ'll never have to pass it the location of a configuration file again. Just to give you an example of the format for the configuration file (this is included in the resources subdirectory): [current engine] engine= [parameter values] user= pass= host= db= [parameter lists] mysql=mysql://{user}:{pass}@{host}/{db} postgresql=postgresql://{user}:{pass}@{host}/{db} Note The configuration file shown here does not include driver, port or other information that you would probably want to be able to specify in a real case. Those parameters are supported, so if you modify the configuration to include them, the struqtural.database.connector.Databaseconnector will work as expected. -- http://mail.python.org/mailman/listinfo/python-announce-list Support the Python Software Foundation: http://www.python.org/psf/donations/