WAeUP Data Center
*****************

The WAeUP data center cares for managing CSV files and importing them.

:Test-Layer: unit

Creating a data center
======================

A data center can be created easily:

>>> from waeup.datacenter import DataCenter
>>> mydatacenter = DataCenter()
>>> mydatacenter
<waeup.datacenter.DataCenter object at 0x...>

Each data center has a location in the file system where files are
stored:

>>> storagepath = mydatacenter.storage
>>> storagepath
'/.../src/waeup/files'


Managing the storage path
-------------------------

We can set another storage path:

>>> import os
>>> os.mkdir('newlocation')
>>> newpath = os.path.abspath('newlocation')
>>> mydatacenter.setStoragePath(newpath)
[]

The result here is a list of filenames that could not be copied.
Luckily, this list is empty.

When we set a new storage path, we can tell the data center to move
all files from the old location to the new one. To see this feature
in action, we first have to put a file into the old location:

>>> open(os.path.join(newpath, 'myfile.txt'), 'wb').write('hello')

Now we can set a new location and the file will be moved there:

>>> verynewpath = os.path.abspath('verynewlocation')
>>> os.mkdir(verynewpath)
>>> mydatacenter.setStoragePath(verynewpath, move=True)
[]

>>> storagepath = mydatacenter.storage
>>> 'myfile.txt' in os.listdir(verynewpath)
True

We remove the created file to have a clean testing environment for
the upcoming examples:

>>> os.unlink(os.path.join(storagepath, 'myfile.txt'))


Uploading files
===============

We can get a list of files stored in that location:

>>> mydatacenter.getFiles()
[]

Let's put a file into the storage:

>>> import os
>>> filepath = os.path.join(storagepath, 'data.csv')
>>> open(filepath, 'wb').write('Some Content\n')

Now we can find that file:

>>> mydatacenter.getFiles()
[<...DataCenterFile object at 0x...>]

As we can see, the actual file is wrapped by a convenience wrapper
that enables us to fetch some data about the file.
The data returned is formatted in strings, so that it can easily be
put into output pages:

>>> datafile = mydatacenter.getFiles()[0]
>>> datafile.getSize()
'13 bytes'

>>> datafile.getDate() # Nearly current datetime...
'...'

Clean up:

>>> import shutil
>>> shutil.rmtree(newpath)
>>> shutil.rmtree(verynewpath)


Handling imports
================

Data centers can find objects that are ready for CSV imports and
associate appropriate importers with them.

Getting importers
-----------------

To do so, data centers look up their parents for the nearest
ancestor that implements `ICSVDataReceivers` and grab all attributes
that provide some importer.

We therefore have to set up a proper scenario first. We start by
creating a simple thing that is ready to receive CSV data:

>>> class MyCSVReceiver(object):
...     pass

Then we create a container for such CSV receivers:

>>> import grok
>>> from waeup.interfaces import ICSVDataReceivers
>>> from waeup.datacenter import DataCenter
>>> class SomeContainer(grok.Container):
...     grok.implements(ICSVDataReceivers)
...     def __init__(self):
...         self.some_receiver = MyCSVReceiver()
...         self.other_receiver = MyCSVReceiver()
...         self.datacenter = DataCenter()

By implementing `ICSVDataReceivers`, a pure marker interface, we
indicate that we want instances of this class to be searched for CSV
receivers.

This root container has two CSV receivers. The data center is also
an attribute of our root container.

Before we can go into action, we also need an importer that is able
to import data into instances of `MyCSVReceiver`:

>>> from waeup.csvfile.interfaces import ICSVFile
>>> from waeup.interfaces import IWAeUPCSVImporter
>>> from waeup.utils.importexport import CSVImporter
>>> class MyCSVImporter(CSVImporter):
...     grok.adapts(ICSVFile, MyCSVReceiver)
...     grok.provides(IWAeUPCSVImporter)
...     datatype = u'My Stuff'
...     def doImport(self, filepath, clear_old_data=True,
...                  overwrite=True):
...         print "Data imported!"
We grok the components to get the importer (which is actually an
adapter) registered with the component architecture:

>>> grok.testing.grok('waeup')
>>> grok.testing.grok_component('MyCSVImporter', MyCSVImporter)
True

Now we can create an instance of `SomeContainer`:

>>> mycontainer = SomeContainer()

As we are not creating real sites and the objects are 'placeless'
from the ZODB point of view, we fake a location by telling the data
center that its parent is the container:

>>> mycontainer.datacenter.__parent__ = mycontainer
>>> datacenter = mycontainer.datacenter

When a data center is stored in the ZODB, this step happens
automatically.

Before we can go on, we have to set a usable path where we can store
files without doing harm:

>>> os.mkdir('filestore')
>>> filestore = os.path.abspath('filestore')
>>> datacenter.setStoragePath(filestore)
[]

Furthermore we must create a file for a possible import, as we will
only get those importers for which an importable file is also
available:

>>> import os
>>> filepath = os.path.join(datacenter.storage, 'mydata.csv')
>>> open(filepath, 'wb').write("""col1,col2
... 'ATerm','Something'
... """)

The data center is now able to find the CSV receivers in its
parents:

>>> datacenter.getImporters()
[<MyCSVImporter object at 0x...>, <MyCSVImporter object at 0x...>]


Imports with the WAeUP portal
-----------------------------

The examples above look complicated, but this is the price for
modularity: if you create a new container type, you can define an
importer for it and it will be used automatically by the other
components.

In the WAeUP portal the only component that actually provides CSV
data importables is the `University` object.


Getting imports (not: importers)
--------------------------------

We can get 'imports':

>>> datacenter.getPossibleImports()
[(<...DataCenterFile object at 0x...>, [(<MyCSVImporter object at 0x...>, '...'), (<MyCSVImporter object at 0x...>, '...')])]

As we can see, an import is defined here as a tuple of a
DataCenterFile and a list of available importers with an associated
data receiver (the thing where the data should go to).
The data receiver is given as a ZODB object id (if the data receiver
is persistent) or a simple id (if it is not).

Clean up:

>>> import shutil
>>> shutil.rmtree(filestore)


Data center helpers
===================

Data centers provide several helper methods to make their usage more
convenient.

Receivers and receiver ids
--------------------------

As already mentioned above, imports are defined as triples
containing

* a file to import,

* an importer to do the import and

* an object which should be updated by the data file.

The latter normally is some kind of container, like a faculty
container or similar. This is what we call a ``receiver``, as it
receives the data from the file via the importer.

The data center finds receivers by looking up its parents for a
component that implements `ICSVDataReceivers` and scanning that
component for attributes that can be adapted to `ICSVImporter`.

I.e., once an `ICSVDataReceivers` parent is found, the data center
gets all importers that can be applied to attributes of this
component. For each attribute there can be at most one importer.

When building the importer list for a certain file, we also check
that the headers of the file comply with what the respective
importers expect. So, if a file contains broken headers, the file
won't be offered for import at all.

The contexts of the found importers then build our list of available
receivers. This also means that for each receiver provided by the
data center an importer is available as well. If no importer can be
found for a potential receiver, this receiver is skipped.

As one type of importer might be able to serve several receivers, we
also have to provide a unique id for each receiver. This is where
``receiver ids`` come into play.

The receiver id of an object is determined as

* the ZODB oid of the object, if the object is persistent,

* the result of ``id(obj)`` otherwise.

The value obtained this way is a long integer which we turn into a
string.
If the value was obtained from the ZODB oid, we also prepend it with
a ``z`` to avoid any clash with ids of non-ZODB objects (they might
deliver the same number, although this is *very* unlikely).
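The id scheme described above can be sketched roughly as follows.
This is only an illustration, not the actual WAeUP implementation;
it assumes that persistent objects expose their 8-byte ZODB oid via
the standard ``_p_oid`` attribute:

```python
import struct

def receiver_id(obj):
    # Sketch only: persistent ZODB objects carry their 8-byte oid in
    # `_p_oid`; we unpack it into an integer and prefix it with 'z'
    # to keep it distinct from plain id() values.
    oid = getattr(obj, '_p_oid', None)
    if oid is not None:
        return 'z%d' % struct.unpack('>Q', oid)[0]
    # Non-persistent objects fall back to the plain id() value.
    return '%d' % id(obj)

class FakePersistent(object):
    # A stand-in for a persistent object; this oid corresponds to
    # the integer 1.
    _p_oid = b'\x00\x00\x00\x00\x00\x00\x00\x01'

print(receiver_id(FakePersistent()))     # -> 'z1'
print(receiver_id(object()).isdigit())   # -> True
```

Either way the result is a string, so receiver ids can be compared
and embedded in forms or URLs uniformly.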
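The header check mentioned earlier (a file with broken headers is
not offered for import at all) can be sketched like this; the
function and parameter names are hypothetical, not part of the WAeUP
API:

```python
import csv
import io

def headers_comply(csv_text, required_fields):
    # A file is only offered for import if its header row contains
    # every column the importer expects.
    try:
        headers = next(csv.reader(io.StringIO(csv_text)))
    except StopIteration:
        # An empty file has no headers and nothing to import.
        return False
    return set(required_fields) <= set(headers)

print(headers_comply("col1,col2\nATerm,Something\n", ["col1", "col2"]))  # -> True
print(headers_comply("foo,bar\nA,B\n", ["col1", "col2"]))                # -> False
```

Extra columns in the file are harmless under this scheme; only
missing expected columns disqualify a file.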