source: waeup/branches/ulif-namespace/src/waeup/sirp/datacenter.txt @ 4797

Last change on this file since 4797 was 4797, checked in by uli, 15 years ago

Fix all tests to reflect new namespace.

WAeUP Data Center
*****************

The WAeUP data center manages CSV files and imports them.

:Test-Layer: unit

Creating a data center
======================

A data center can be created easily:

    >>> from waeup.sirp.datacenter import DataCenter
    >>> mydatacenter = DataCenter()
    >>> mydatacenter
    <waeup.sirp.datacenter.DataCenter object at 0x...>

Each data center has a location in the file system where files are
stored:

    >>> storagepath = mydatacenter.storage
    >>> storagepath
    '/.../src/waeup/sirp/files'


Managing the storage path
-------------------------

We can set another storage path:

    >>> import os
    >>> os.mkdir('newlocation')
    >>> newpath = os.path.abspath('newlocation')
    >>> mydatacenter.setStoragePath(newpath)
    []

The result is a list of the names of files that could not be
copied. Luckily, this list is empty.

When we set a new storage path, we can ask for all files in the old
location to be moved to the new one. To see this feature in action, we
first have to put a file into the current location, which is about to
become the old one:

    >>> open(os.path.join(newpath, 'myfile.txt'), 'wb').write('hello')

Now we can set a new location and the file will be copied:

    >>> verynewpath = os.path.abspath('verynewlocation')
    >>> os.mkdir(verynewpath)

    >>> mydatacenter.setStoragePath(verynewpath, move=True)
    []

    >>> storagepath = mydatacenter.storage
    >>> 'myfile.txt' in os.listdir(verynewpath)
    True

We remove the created file to have a clean testing environment for
upcoming examples:

    >>> os.unlink(os.path.join(storagepath, 'myfile.txt'))
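
How `setStoragePath` transfers files is not shown in this document.
The following standalone sketch only illustrates the idea of copying
files and reporting the names that could not be copied. The function
name `copy_files`, its `overwrite` parameter and the rule "an existing
target counts as not copied" are assumptions made for illustration, not
the `DataCenter` implementation:

```python
import os
import shutil
import tempfile


def copy_files(src_dir, dst_dir, overwrite=False):
    """Copy regular files from src_dir to dst_dir.

    Returns the names of the files that were NOT copied.
    Hypothetical helper, not part of waeup.sirp.
    """
    not_copied = []
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        dst = os.path.join(dst_dir, name)
        if not os.path.isfile(src):
            continue  # this sketch ignores subdirectories
        if os.path.exists(dst) and not overwrite:
            not_copied.append(name)  # never clobber silently
            continue
        shutil.copy2(src, dst)
    return not_copied


old_dir = tempfile.mkdtemp()
new_dir = tempfile.mkdtemp()
with open(os.path.join(old_dir, 'myfile.txt'), 'w') as fd:
    fd.write('hello')

first_run = copy_files(old_dir, new_dir)   # nothing in the way
second_run = copy_files(old_dir, new_dir)  # target already exists
```

In this sketch the first call returns an empty list, like the examples
above, while the second call reports `myfile.txt` because the target
already exists and `overwrite` is not set.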

Uploading files
===============

We can get a list of files stored in that location:

    >>> mydatacenter.getFiles()
    []

Let's put a file into the storage:

    >>> import os
    >>> filepath = os.path.join(storagepath, 'data.csv')
    >>> open(filepath, 'wb').write('Some Content\n')

Now we can find the file:

    >>> mydatacenter.getFiles()
    [<waeup.sirp.datacenter.DataCenterFile object at 0x...>]

As we can see, the actual file is wrapped by a convenience wrapper
that enables us to fetch some data about the file. The data is
returned as formatted strings, so that it can easily be put into
output pages:

    >>> datafile = mydatacenter.getFiles()[0]
    >>> datafile.getSize()
    '13 bytes'

    >>> datafile.getDate() # Nearly current datetime...
    '...'

Clean up:

    >>> import shutil
    >>> shutil.rmtree(newpath)
    >>> shutil.rmtree(verynewpath)
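
To illustrate what such a convenience wrapper might look like, here is
a minimal standalone sketch. The class `FileInfo`, its method names and
the date format are assumptions for illustration; the real
`DataCenterFile` may format its values differently:

```python
import os
import tempfile
import time


class FileInfo(object):
    """Hypothetical stand-in for a wrapper such as DataCenterFile."""

    def __init__(self, path):
        self.path = path

    def get_size(self):
        # Size as a display string; this sketch only knows bytes.
        return '%s bytes' % os.path.getsize(self.path)

    def get_date(self):
        # Last modification time as a display string.
        mtime = os.path.getmtime(self.path)
        return time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(mtime))


tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'data.csv')
with open(path, 'wb') as fd:
    fd.write(b'Some Content\n')

info = FileInfo(path)
size_text = info.get_size()
date_text = info.get_date()
```

Returning strings instead of raw numbers keeps page templates free of
formatting logic, which seems to be the motivation given above.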


Handling imports
================

Data centers can find objects ready for CSV imports and associate
appropriate importers with them.

Getting importers
-----------------

To do so, a data center searches its parents for the nearest ancestor
that implements `ICSVDataReceivers` and collects all attributes of
that ancestor for which an importer is available.

We therefore have to set up a proper scenario first.

We start by creating a simple thing that is ready for receiving CSV
data:

    >>> class MyCSVReceiver(object):
    ...   pass

Then we create a container for such a CSV receiver:

    >>> import grok
    >>> from waeup.sirp.interfaces import ICSVDataReceivers
    >>> from waeup.sirp.datacenter import DataCenter
    >>> class SomeContainer(grok.Container):
    ...   grok.implements(ICSVDataReceivers)
    ...   def __init__(self):
    ...     self.some_receiver = MyCSVReceiver()
    ...     self.other_receiver = MyCSVReceiver()
    ...     self.datacenter = DataCenter()

By implementing `ICSVDataReceivers`, a pure marker interface, we
indicate that we want instances of this class to be searched for CSV
receivers.

This root container has two CSV receivers.

The datacenter is also an attribute of our root container.

Before we can go into action, we also need an importer that is able
to import data into instances of `MyCSVReceiver`:

    >>> from waeup.sirp.csvfile.interfaces import ICSVFile
    >>> from waeup.sirp.interfaces import IWAeUPCSVImporter
    >>> from waeup.sirp.utils.importexport import CSVImporter
    >>> class MyCSVImporter(CSVImporter):
    ...   grok.adapts(ICSVFile, MyCSVReceiver)
    ...   grok.provides(IWAeUPCSVImporter)
    ...   datatype = u'My Stuff'
    ...   def doImport(self, filepath, clear_old_data=True,
    ...                                overwrite=True):
    ...     print "Data imported!"

We grok the components to get the importer (which is actually an
adapter) registered with the component architecture:

    >>> grok.testing.grok('waeup')
    >>> grok.testing.grok_component('MyCSVImporter', MyCSVImporter)
    True

Now we can create an instance of `SomeContainer`:

    >>> mycontainer = SomeContainer()

As we are not creating real sites and the objects are 'placeless' from
the ZODB point of view, we fake a location by telling the datacenter
that its parent is the container:

    >>> mycontainer.datacenter.__parent__ = mycontainer
    >>> datacenter = mycontainer.datacenter

When a datacenter is stored in the ZODB, this step happens
automatically.

Before we can go on, we have to set a usable path where we can store
files without doing harm:

    >>> os.mkdir('filestore')
    >>> filestore = os.path.abspath('filestore')
    >>> datacenter.setStoragePath(filestore)
    []

Furthermore, we must create a file for a possible import, because we
only get importers for which an importable file is also available:

    >>> import os
    >>> filepath = os.path.join(datacenter.storage, 'mydata.csv')
    >>> open(filepath, 'wb').write("""col1,col2
    ... 'ATerm','Something'
    ... """)

The datacenter is now able to find the CSV receivers in its parents:

    >>> datacenter.getImporters()
    [<MyCSVImporter object at 0x...>, <MyCSVImporter object at 0x...>]
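
The parent lookup used in this section can be pictured with a small
standalone sketch. It walks the `__parent__` chain until it meets an
object carrying a marker. The classes `Node` and `ReceiversNode`, the
attribute `is_receivers` (standing in for the `ICSVDataReceivers`
interface) and the function `find_receivers_ancestor` are all
hypothetical and only serve the illustration:

```python
class Node(object):
    """Plain object with a ``__parent__`` link, as containers provide."""

    def __init__(self, parent=None):
        self.__parent__ = parent


class ReceiversNode(Node):
    """Stands in for a class implementing ``ICSVDataReceivers``."""
    is_receivers = True  # hypothetical marker replacing the interface


def find_receivers_ancestor(obj):
    """Return the nearest ancestor carrying the marker, or None."""
    current = getattr(obj, '__parent__', None)
    while current is not None:
        if getattr(current, 'is_receivers', False):
            return current
        current = getattr(current, '__parent__', None)
    return None


root = ReceiversNode()
folder = Node(parent=root)
center = Node(parent=folder)  # two levels below the marked ancestor
orphan = Node()               # 'placeless': no parent at all
```

Here `find_receivers_ancestor(center)` yields `root`, while the
placeless `orphan` yields `None`. This is why the doctest above has to
set `__parent__` by hand before any importers can be found.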


Imports with the WAeUP portal
-----------------------------

The examples above look complicated, but this is the price for
modularity. If you create a new container type, you can define an
importer and it will be used automatically by other components.

In the WAeUP portal the only component that actually provides CSV data
importables is the `University` object.


Getting imports (not: importers)
--------------------------------

We can get 'imports':

    >>> datacenter.getPossibleImports()
    [(<...DataCenterFile object at 0x...>,
      [(<MyCSVImporter object at 0x...>, '...'),
       (<MyCSVImporter object at 0x...>, '...')])]

As we can see, an import is defined here as a tuple of a
`DataCenterFile` and a list of available importers, each paired with
its data receiver (the thing the data should go to).

The data receiver is given as a ZODB object id (if the data receiver
is persistent) or a simple id (if it is not).

Clean up:

    >>> import shutil
    >>> shutil.rmtree(filestore)


Data center helpers
===================

Data centers provide several helper methods to make their usage more
convenient.


Receivers and receiver ids
--------------------------

As already mentioned above, imports are defined as triples containing

* a file to import,

* an importer to do the import and

* an object that should be updated by the data file.

The latter normally is some kind of container, like a faculty
container or similar. This is what we call a ``receiver``, as it
receives the data from the file via the importer.

The datacenter finds receivers by searching its parents for a
component that implements `ICSVDataReceivers` and scanning that
component for attributes that can be adapted to `ICSVImporter`.

That is, once an `ICSVDataReceivers` parent is found, the datacenter
gets all importers that can be applied to attributes of this
component. For each attribute there can be at most one importer.

When building the importer list for a certain file, we also check
that the headers of the file comply with what the respective importers
expect. So, if a file contains broken headers, the file won't be
offered for import at all.
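
What exactly "comply" means is up to each importer and is not spelled
out here. As a rough sketch, assuming that an importer simply requires
a set of column names to be present in the first row, such a check
could look like this (`headers_comply` and its `required` argument are
hypothetical names):

```python
import csv
import io


def headers_comply(csv_text, required):
    """True if the first CSV row contains every required column name."""
    reader = csv.reader(io.StringIO(csv_text))
    try:
        headers = next(reader)
    except StopIteration:
        return False  # an empty file has no headers at all
    return set(required).issubset(h.strip() for h in headers)


good = headers_comply(u"col1,col2\n'ATerm','Something'\n", ['col1', 'col2'])
bad = headers_comply(u"col1,wrong\n'ATerm','Something'\n", ['col1', 'col2'])
empty = headers_comply(u"", ['col1'])
```

Under this assumption only the first file would be offered for import;
the other two would be filtered out before any importer is listed.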

The contexts of the found importers then make up our list of available
receivers. This also means that for each receiver provided by the
datacenter, there is an importer available.

If no importer can be found for a potential receiver, this receiver
is skipped.

As one type of importer might be able to serve several receivers, we
also have to provide a unique id for each receiver. This is where
``receiver ids`` come into play.

Receiver ids of objects are determined as

* the ZODB oid of the object if the object is persistent

* the result of ``id(obj)`` otherwise.

The value obtained this way is a long integer, which we turn into a
string. If the value was obtained from the ZODB oid, we also prefix it
with a ``z`` to avoid any clash with non-ZODB objects (they might
deliver the same id, although this is *very* unlikely).
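
The rule above can be sketched without a running ZODB. Persistent
objects expose their oid as the 8-byte string attribute `_p_oid`; the
class `FakePersistent` and the function `get_receiver_id` below are
hypothetical names used for illustration only, and the real helper may
differ in detail:

```python
import struct


class FakePersistent(object):
    """Hypothetical stand-in for a persistent object with a ZODB oid."""

    def __init__(self, oid_number):
        # ZODB oids are 8-byte strings; pack the number big-endian.
        self._p_oid = struct.pack('>Q', oid_number)


def get_receiver_id(obj):
    """Return 'z<oid>' for persistent objects, else str(id(obj))."""
    oid = getattr(obj, '_p_oid', None)
    if oid is not None:
        # Turn the 8-byte oid back into an integer before formatting.
        return 'z%s' % struct.unpack('>Q', oid)[0]
    return str(id(obj))


stored = FakePersistent(42)
plain = object()
stored_id = get_receiver_id(stored)
plain_id = get_receiver_id(plain)
```

The ``z`` prefix keeps both kinds of ids apart even if a memory address
happens to equal an oid number.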