source: waeup/branches/ulif-rewrite/src/waeup/datacenter.txt @ 4232

Last change on this file since 4232 was 4225, checked in by uli, 15 years ago

Update tests.

WAeUP Data Center
*****************

The WAeUP data center manages CSV files and imports them.

:Test-Layer: unit

Creating a data center
======================

A data center can be created easily:

    >>> from waeup.datacenter import DataCenter
    >>> mydatacenter = DataCenter()
    >>> mydatacenter
    <waeup.datacenter.DataCenter object at 0x...>

Each data center has a location in the file system where files are stored:

    >>> storagepath = mydatacenter.storage
    >>> storagepath
    '/.../src/waeup/files'


Managing the storage path
-------------------------

We can set another storage path:

    >>> import os
    >>> os.mkdir('newlocation')
    >>> newpath = os.path.abspath('newlocation')
    >>> mydatacenter.setStoragePath(newpath)
    []

The result is a list of the names of files that could not be
copied. Luckily, this list is empty.

When we set a new storage path, we can ask the data center to move all
files from the old location to the new one. To see this feature in
action, we first have to put a file into the old location:

    >>> open(os.path.join(newpath, 'myfile.txt'), 'wb').write('hello')

Now we can set a new location and the file will be moved there:

    >>> verynewpath = os.path.abspath('verynewlocation')
    >>> os.mkdir(verynewpath)

    >>> mydatacenter.setStoragePath(verynewpath, move=True)
    []

    >>> storagepath = mydatacenter.storage
    >>> 'myfile.txt' in os.listdir(verynewpath)
    True

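How such a move could work is sketched below. This is only a minimal
illustration in plain Python, not the code behind `setStoragePath`:
the helper name `move_files` and the decision to skip files whose name
already exists in the target directory are assumptions made for this
example.

```python
import os
import shutil


def move_files(old_dir, new_dir):
    """Move the regular files in old_dir to new_dir.

    Returns the names of files that could not be copied.
    Hypothetical helper, not part of waeup.
    """
    not_copied = []
    for name in sorted(os.listdir(old_dir)):
        source = os.path.join(old_dir, name)
        if not os.path.isfile(source):
            continue  # this sketch ignores subdirectories
        target = os.path.join(new_dir, name)
        if os.path.exists(target):
            # Assumption: never overwrite, report the name instead.
            not_copied.append(name)
            continue
        try:
            shutil.copyfile(source, target)
        except (IOError, OSError):
            not_copied.append(name)
            continue
        os.unlink(source)  # drop the original only after a successful copy
    return not_copied
```

Like the call above, the helper returns the names of the files it
could not copy, so an empty list means that everything arrived in the
new location.
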
We remove the created file to have a clean testing environment for
upcoming examples:

    >>> os.unlink(os.path.join(storagepath, 'myfile.txt'))

Uploading files
===============

We can get a list of files stored in that location:

    >>> mydatacenter.getFiles()
    []

Let's put a file into the storage:

    >>> import os
    >>> filepath = os.path.join(storagepath, 'data.csv')
    >>> open(filepath, 'wb').write('Some Content\n')

Now we can find the file:

    >>> mydatacenter.getFiles()
    [<waeup.datacenter.DataCenterFile object at 0x...>]

As we can see, the actual file is wrapped in a convenience wrapper
that enables us to fetch some data about the file. The data is
returned as strings, so that it can easily be put into output pages:

    >>> datafile = mydatacenter.getFiles()[0]
    >>> datafile.getSize()
    '13 bytes'

    >>> datafile.getDate() # Nearly current datetime...
    '...'

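A wrapper of this kind can be very small. The following sketch shows
one way to produce display-ready strings for a file; the class name
`FileInfo`, its method names and the date format are assumptions of
this example and say nothing about how `DataCenterFile` is actually
written.

```python
import os
import time


class FileInfo(object):
    """Hypothetical stand-in for a file wrapper; not the waeup class."""

    def __init__(self, path):
        self.path = path

    def get_size(self):
        # Report the size as text that can go straight into a page.
        return '%d bytes' % os.path.getsize(self.path)

    def get_date(self):
        # Format the last-modification time; the format is an assumption.
        mtime = os.path.getmtime(self.path)
        return time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(mtime))
```

Returning strings keeps formatting decisions in one place, at the
price that callers who need the raw numbers have to ask the file
system themselves.
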
Clean up:

    >>> import shutil
    >>> shutil.rmtree(newpath)
    >>> shutil.rmtree(verynewpath)


Handling imports
================

Data centers can find objects ready for CSV imports and associate
appropriate importers with them.

Getting importers
-----------------

To do so, a data center searches its parents for the nearest ancestor
that implements `ICSVDataReceivers` and collects all attributes of
that ancestor for which an importer is available.

We therefore have to set up a proper scenario first.

We start by creating a simple thing that is ready for receiving CSV
data:

    >>> class MyCSVReceiver(object):
    ...   pass

Then we create a container for such a CSV receiver:

    >>> import grok
    >>> from waeup.interfaces import ICSVDataReceivers
    >>> from waeup.datacenter import DataCenter
    >>> class SomeContainer(grok.Container):
    ...   grok.implements(ICSVDataReceivers)
    ...   def __init__(self):
    ...     self.some_receiver = MyCSVReceiver()
    ...     self.other_receiver = MyCSVReceiver()
    ...     self.datacenter = DataCenter()

By implementing `ICSVDataReceivers`, a pure marker interface, we
indicate that we want instances of this class to be searched for CSV
receivers.

This root container has two CSV receivers.

The datacenter is also an attribute of our root container.

Before we can go into action, we also need an importer that is able
to import data into instances of `MyCSVReceiver`:

    >>> from waeup.csvfile.interfaces import ICSVFile
    >>> from waeup.interfaces import IWAeUPCSVImporter
    >>> from waeup.utils.importexport import CSVImporter
    >>> class MyCSVImporter(CSVImporter):
    ...   grok.adapts(ICSVFile, MyCSVReceiver)
    ...   grok.provides(IWAeUPCSVImporter)
    ...   datatype = u'My Stuff'
    ...   def doImport(self, filepath, clear_old_data=True,
    ...                                overwrite=True):
    ...     print "Data imported!"

We grok the components to get the importer (which is actually an
adapter) registered with the component architecture:

    >>> grok.testing.grok('waeup')
    >>> grok.testing.grok_component('MyCSVImporter', MyCSVImporter)
    True

Now we can create an instance of `SomeContainer`:

    >>> mycontainer = SomeContainer()

As we are not creating real sites and the objects are 'placeless' from
the ZODB point of view, we fake a location by telling the datacenter
that its parent is the container:

    >>> mycontainer.datacenter.__parent__ = mycontainer
    >>> datacenter = mycontainer.datacenter

When a datacenter is stored in the ZODB, this step happens
automatically.

The datacenter is now able to find the CSV receivers in its parents:

    >>> datacenter.getImporters()
    [<MyCSVImporter object at 0x...>, <MyCSVImporter object at 0x...>]


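The lookup itself can be pictured as a walk up the `__parent__`
chain. The sketch below uses plain Python only: a marker base class
`Receivers` takes the place of the marker interface, and the caller
passes in a `find_importer` function instead of relying on adapter
lookup. All of these names are made up for illustration and are not
part of waeup or grok.

```python
class Receivers(object):
    """Hypothetical marker class standing in for a marker interface."""


def find_marked_ancestor(obj, marker=Receivers):
    """Return the nearest ancestor of obj that is an instance of marker."""
    current = getattr(obj, '__parent__', None)
    while current is not None:
        if isinstance(current, marker):
            return current
        current = getattr(current, '__parent__', None)
    return None


def get_importers(obj, find_importer):
    """Collect an importer for every attribute of the marked ancestor.

    find_importer(value) returns an importer or None; attributes
    without an importer are skipped.
    """
    root = find_marked_ancestor(obj)
    if root is None:
        return []
    importers = []
    for name, value in sorted(vars(root).items()):
        importer = find_importer(value)
        if importer is not None:
            importers.append(importer)
    return importers
```

In the scenario above the datacenter would play the role of `obj` and
the container that of the marked ancestor; the datacenter attribute
itself simply yields no importer and is skipped.
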
Imports with the WAeUP portal
-----------------------------

The examples above look complicated, but this is the price of
modularity. If you create a new container type, you can define an
importer for it and it will be used automatically by other components.

In the WAeUP portal the only component that actually provides CSV data
receivers is the `University` object.


Getting imports (not: importers)
--------------------------------

We can also get 'imports'. For this, we need a CSV file to import:

    >>> import os
    >>> filepath = os.path.join(datacenter.storage, 'mydata.csv')
    >>> open(filepath, 'wb').write("""col1,col2
    ... 'ATerm','Something'
    ... """)

    >>> datacenter.getPossibleImports()
    [(<...DataCenterFile object at 0x...>,
      [(<MyCSVImporter object at 0x...>, '...'),
       (<MyCSVImporter object at 0x...>, '...')])]

As we can see, an import is defined here as a tuple of a
`DataCenterFile` and a list of available importers, each paired with
the id of its data receiver (the thing the data should go to).

The data receiver is given as a ZODB object id (if the data receiver
is persistent) or a simple id (if it is not).

Clean up:

    >>> os.unlink(filepath)


Data center helpers
===================

Data centers provide several helper methods to make their usage more
convenient.


Receivers and receiver ids
--------------------------

As already mentioned above, imports are defined as triples containing

* a file to import,

* an importer to do the import, and

* an object that should be updated from the data file.

The latter normally is some kind of container, like a faculty
container or similar. This is what we call a ``receiver``, as it
receives the data from the file via the importer.

The datacenter finds receivers by searching its parents for a
component that implements `ICSVDataReceivers` and scanning that
component for attributes that can be adapted to `ICSVImporter`.

I.e., once an `ICSVDataReceivers` parent is found, the datacenter gets
all importers that can be applied to attributes of this component. For
each attribute there can be at most one importer.

When building the importer list for a certain file, we also check
that the headers of the file comply with what the respective importers
expect. So, if a file contains broken headers, the file won't be
offered for import at all.

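What "comply" means in detail is up to the importers. As a rough
sketch, assume that an importer simply names the columns it needs and
that a file complies if all of them occur in its first row. The
function names and the `(name, required_columns)` pairs below are
inventions of this example, not the real importer API.

```python
import csv


def read_headers(lines):
    """Return the column names found in the first row of the CSV lines."""
    for row in csv.reader(lines):
        return [name.strip() for name in row]
    return []  # no rows at all


def headers_comply(lines, required):
    """Tell whether every required column name occurs in the header row."""
    return set(required).issubset(read_headers(lines))


def usable_importers(lines, importers):
    """Keep the names of importers whose required columns are all present.

    importers is a list of (name, required_columns) pairs, a made-up
    stand-in for real importer objects.
    """
    lines = list(lines)  # allow several passes over the same data
    return [name for name, required in importers
            if headers_comply(lines, required)]
```

If `usable_importers` returns an empty list for a file, a caller
following the rule above would not offer that file for import.
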
The contexts of the importers found then make up our list of available
receivers. This also means that for each receiver provided by the
datacenter, there is an importer available.

If no importer can be found for a potential receiver, this receiver
is skipped.

As one type of importer might be able to serve several receivers, we
also have to provide a unique id for each receiver. This is where
``receiver ids`` come into play.

Receiver ids of objects are determined as

* the ZODB oid of the object if the object is persistent

* the result of id(obj) otherwise.

The value obtained this way is a long integer, which we turn into a
string. If the value was obtained from the ZODB oid, we also prefix it
with a ``z`` to avoid any clash with non-ZODB objects (they might
deliver the same id, although this is *very* unlikely).
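
The rule can be written down in a few lines. The sketch below assumes
that a persistent object carries its oid as an 8-byte string in the
`_p_oid` attribute and that an object without an oid is treated like a
non-persistent one; the function name `receiver_id` is made up for
this example.

```python
import struct


def receiver_id(obj):
    """Return a string id for obj, prefixed with 'z' for ZODB objects."""
    oid = getattr(obj, '_p_oid', None)
    if oid is not None:
        # Assumption: the oid is an 8-byte string, read here as a
        # big-endian unsigned integer.
        return 'z%d' % struct.unpack('>Q', oid)[0]
    # Non-persistent objects, and objects without an oid, fall back
    # to the interpreter's id().
    return '%d' % id(obj)
```

Note that ids taken from `id(obj)` are only stable while the object is
alive in the current process, which is enough to tell receivers apart
within one request.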