1 | ## |
---|
2 | ## imagestorage.py |
---|
3 | ## Login : <uli@pu.smp.net> |
---|
4 | ## Started on Mon Jul 4 16:02:14 2011 Uli Fouquet |
---|
5 | ## $Id$ |
---|
6 | ## |
---|
7 | ## Copyright (C) 2011 Uli Fouquet |
---|
8 | ## This program is free software; you can redistribute it and/or modify |
---|
9 | ## it under the terms of the GNU General Public License as published by |
---|
10 | ## the Free Software Foundation; either version 2 of the License, or |
---|
11 | ## (at your option) any later version. |
---|
12 | ## |
---|
13 | ## This program is distributed in the hope that it will be useful, |
---|
14 | ## but WITHOUT ANY WARRANTY; without even the implied warranty of |
---|
15 | ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
---|
16 | ## GNU General Public License for more details. |
---|
17 | ## |
---|
18 | ## You should have received a copy of the GNU General Public License |
---|
19 | ## along with this program; if not, write to the Free Software |
---|
20 | ## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA |
---|
21 | ## |
---|
22 | """A storage for image files. |
---|
23 | |
---|
24 | A few words about storing files with ``waeup.sirp``. The need for this |
---|
25 | feature arose initially from the need to store passport files for |
---|
26 | applicants and students. These files are dynamic (can be changed |
---|
27 | anytime), mean a lot of traffic and cost a lot of memory/disk space. |
---|
28 | |
---|
29 | **Design Basics** |
---|
30 | |
---|
31 | While one *can* store images and similar 'large binary objects' aka |
---|
32 | blobs in the ZODB, this approach quickly becomes cumbersome and |
---|
33 | difficult to understand. The worst approach here would be to store |
---|
34 | images as regular byte-stream objects. ZODB supports this but |
---|
35 | obviously access is slow (data must be looked up in the one |
---|
36 | ``Data.fs`` file, each file has to be sent to the ZEO server and back, |
---|
37 | etc.). |
---|
38 | |
---|
39 | A slightly better approach is to store images in the ZODB but as |
---|
40 | Blobs. ZODB supports storing blobs in separate files in order to |
---|
41 | accelerate lookup/retrieval of these files. The files, however, have |
---|
42 | to be sent to the ZEO server (and back on lookups) which means a |
---|
43 | bottleneck and will easily result in an increased number of |
---|
44 | ``ConflictErrors`` even on simple reads. |
---|
45 | |
---|
46 | The advantage of both ZODB-geared approaches is, of course, complete |
---|
47 | database consistency. ZODB will guarantee that your files are |
---|
48 | available under some object name and can be handled as any other |
---|
49 | Python object. |
---|
50 | |
---|
51 | Another approach is to leave the ZODB behind and to store images and |
---|
52 | other files in the filesystem directly. This is faster (no ZEO contacts, |
---|
53 | etc.), reduces the probability of ``ConflictErrors``, keeps the ZODB |
---|
54 | smaller, and enables direct access (over filesystem) to the |
---|
55 | files. Furthermore, the steps involved might be easier to understand for |
---|
56 | third-party developers. We opted for this last option. |
---|
57 | |
---|
58 | **External File Store** |
---|
59 | |
---|
60 | Our implementation of the file-storing API is defined in |
---|
61 | :class:`ExtFileStore`. An instance of this file storage (which is also |
---|
62 | able to store non-image files) is available at runtime as a global |
---|
63 | utility implementing :class:`waeup.sirp.interfaces.IExtFileStore`. |
---|
64 | |
---|
65 | The main task of this central component is to maintain a filesystem |
---|
66 | root path for all files to be stored. It also provides methods to |
---|
67 | store/get files under certain file ids which identify certain files |
---|
68 | locally. |
---|
69 | |
---|
70 | So, to store a file away, you can do something like this: |
---|
71 | |
---|
72 | >>> from StringIO import StringIO |
---|
73 | >>> from zope.component import getUtility |
---|
74 | >>> from waeup.sirp.interfaces import IExtFileStore |
---|
75 | >>> store = getUtility(IExtFileStore) |
---|
76 | >>> store.createFile('myfile.txt', StringIO('some file content')) |
---|
77 | |
---|
78 | All you need is a filename and the file-like object containing the |
---|
79 | real file data. |
---|
80 | |
---|
81 | This will store the file somewhere (you shouldn't make too many |
---|
82 | assumptions about the real filesystem path here). |
---|
83 | |
---|
84 | Later, we can get the file back like this: |
---|
85 | |
---|
86 | >>> store.getFile('myfile.txt') |
---|
87 | <open file ...> |
---|
88 | |
---|
89 | What we get back is a file or file-like object already opened for |
---|
90 | reading: |
---|
91 | |
---|
92 | >>> store.getFile('myfile.txt').read() |
---|
93 | 'some file content' |
---|
94 | |
---|
95 | **Handlers: Special Places for Special Files** |
---|
96 | |
---|
97 | The file store supports special handling for certain files. For |
---|
98 | example we want applicant images to be stored in a different directory |
---|
99 | than student images, etc. Because the file store cannot know all |
---|
100 | details about this special treatment of certain files, it looks up |
---|
101 | helpers (handlers) to provide the information it needs for really |
---|
102 | storing the files at the correct location. |
---|
103 | |
---|
104 | That a file stored in the file store needs special handling can be |
---|
105 | indicated by special filenames. These filenames start with a marker like |
---|
106 | this:: |
---|
107 | |
---|
108 | __<MARKER-STRING>__real-filename.jpg |
---|
109 | |
---|
110 | Please note the double underscores before and after the marker |
---|
111 | string. They indicate that all in between is a marker. |
---|
112 | |
---|
113 | If you store a file in file store with such a filename (we call this a |
---|
114 | `file_id` to distinguish it from real world filenames), the file store |
---|
115 | will look up a handler for ``<MARKER-STRING>`` and pass it the file to |
---|
116 | store. The handler then will return the internal path to store the |
---|
117 | file and possibly do additional things as well like validating the |
---|
118 | file or similar. |
---|
119 | |
---|
120 | Examples for such a file store handler can be found in the |
---|
121 | :mod:`waeup.sirp.applicants.applicant` module. Please see also the |
---|
122 | :class:`DefaultFileStoreHandler` class below for more details. |
---|
123 | |
---|
124 | The file store looks up handlers by utility lookups: it looks for a |
---|
125 | named utility providing |
---|
126 | :class:`waeup.sirp.interfaces.IFileStoreHandler` and named like the |
---|
127 | marker string (without leading/trailing underscores) in lower |
---|
128 | case. For example, if the file id is |
---|
129 | |
---|
130 | ``__IMG_USER__manfred.jpg`` |
---|
131 | |
---|
132 | then the utility looked up should be registered under the name |
---|
133 | |
---|
134 | ``img_user`` |
---|
135 | |
---|
136 | and provide :class:`waeup.sirp.interfaces.IFileStoreHandler`. If no |
---|
137 | such utility can be found, a default handler is used instead |
---|
138 | (see :class:`DefaultFileStoreHandler`). |
---|
139 | |
---|
140 | **Context Adapters: Knowing Your Family** |
---|
141 | |
---|
142 | Often the internal filename or file id of a file depends on a |
---|
143 | context. For example when we store passport photographs of applicants, |
---|
144 | then each image belongs to a certain applicant instance. It is not |
---|
145 | difficult to maintain such a connection manually: Say every applicant |
---|
146 | had an id, then we could put this id into the filename as well and |
---|
147 | would build the filename to store/get the connected file by using that |
---|
148 | filename. You then would create filenames of a format like this:: |
---|
149 | |
---|
150 | __<MARKER-STRING>__applicant0001.jpg |
---|
151 | |
---|
152 | where ``applicant0001`` would tell exactly which applicant you can see |
---|
153 | on the photograph. Note that the internal file id might have |
---|
154 | nothing to do with the filename used on upload. The file above could have |
---|
155 | been uploaded with filename ``manfred.jpg`` but with the new file id |
---|
156 | we are able to find the file again later. |
---|
157 | |
---|
158 | Unfortunately it might soon get boring or cumbersome to retype this |
---|
159 | building of filenames for a certain type of context, especially if |
---|
160 | your filenames take more of the context into account than only a |
---|
161 | simple id. |
---|
162 | |
---|
163 | Therefore you can define filename building for a context as an adapter |
---|
164 | that then could be looked up by other components simply by doing |
---|
165 | something like: |
---|
166 | |
---|
167 | >>> from waeup.sirp.interfaces import IFileStoreNameChooser |
---|
168 | >>> file_id = IFileStoreNameChooser(my_context_obj).chooseName() |
---|
169 | |
---|
170 | If you later want to change the way file ids are created from a |
---|
171 | certain context, you only have to change the adapter implementation |
---|
172 | accordingly. |
---|
173 | |
---|
174 | Note that this is only a convenience component. You don't have to |
---|
175 | define context adapters but it makes things easier for others if you |
---|
176 | do, as you don't have to remember the exact file id creation method |
---|
177 | all the time and can change things quickly and in only one location if |
---|
178 | you need to do so. |
---|
179 | |
---|
180 | Please see the :class:`FileStoreNameChooser` default implementation |
---|
181 | below for details. |
---|
182 | |
---|
183 | """ |
---|
184 | import grok |
---|
185 | import os |
---|
186 | import tempfile |
---|
187 | from hurry.file import HurryFile |
---|
188 | from hurry.file.interfaces import IFileRetrieval |
---|
189 | from zope.component import queryUtility |
---|
190 | from zope.interface import Interface |
---|
191 | from waeup.sirp.interfaces import ( |
---|
192 | IFileStoreNameChooser, IExtFileStore, IFileStoreHandler,) |
---|
193 | |
---|
194 | class FileStoreNameChooser(grok.Adapter): |
---|
195 | """Default file store name chooser. |
---|
196 | |
---|
197 | File store name choosers pick a file id, a string, for a certain |
---|
198 | context object. They are normally registered as adapters for a |
---|
199 | certain content type and know how to build the file id for this |
---|
200 | special type of context. |
---|
201 | |
---|
202 | Provides the :class:`waeup.sirp.interfaces.IFileStoreNameChooser` |
---|
203 | interface. |
---|
204 | |
---|
205 | This default file name chooser accepts almost every name as long |
---|
206 | as it is a string or unicode object. |
---|
207 | """ |
---|
208 | grok.context(Interface) |
---|
209 | grok.implements(IFileStoreNameChooser) |
---|
210 | |
---|
211 | def checkName(self, name): |
---|
212 | """Check whether an object name is valid. |
---|
213 | |
---|
214 | Returns ``True`` if the name is valid, ``False`` otherwise. |
---|
215 | |
---|
216 | For the default file store name chooser any string is valid. |
---|
217 | """ |
---|
218 | if isinstance(name, basestring): |
---|
219 | return True |
---|
220 | return False |
---|
221 | |
---|
222 | def chooseName(self, name): |
---|
223 | """Choose a unique valid name for the object. |
---|
224 | |
---|
225 | The given name and object may be taken into account when |
---|
226 | choosing the name. |
---|
227 | |
---|
228 | chooseName is expected to always choose a valid name (that |
---|
229 | would pass the checkName test) and never raise an error. |
---|
230 | |
---|
231 | For this default name chooser we return the given name if it |
---|
232 | is valid or ``unknown_file`` otherwise. |
---|
233 | """ |
---|
234 | if self.checkName(name): |
---|
235 | return name |
---|
236 | return u'unknown_file' |
---|
237 | |
---|
238 | class ExtFileStore(object): |
---|
239 | """External file store. |
---|
240 | |
---|
241 | External file stores are meant to store files outside of the |
---|
242 | ZODB, i.e. in the filesystem. |
---|
243 | |
---|
244 | The most important attribute of the external file store is the `root` |
---|
245 | path which gives the path to the location where files will be |
---|
246 | stored. |
---|
247 | |
---|
248 | By default `root` is a ``'media/'`` directory in the root of the |
---|
249 | datacenter root of a site. |
---|
250 | |
---|
251 | The `root` attribute is 'read-only' because you normally don't |
---|
252 | want to change this path -- it is dynamic. That means, if you call |
---|
253 | the file store from 'within' a site, the root path will be located |
---|
254 | inside this site (a :class:`waeup.sirp.University` instance). If |
---|
255 | you call it from 'outside' a site some temporary dir (always the |
---|
256 | same during lifetime of the file store instance) will be used. The |
---|
257 | term 'temporary' tells what you can expect from this path |
---|
258 | persistence-wise. |
---|
259 | |
---|
260 | If you insist, you can pass a root path on initialization to the |
---|
261 | constructor but when calling from within a site afterwards, the |
---|
262 | site will override your setting for security reasons. This way |
---|
263 | you can safely use one file store for different sites in a Zope |
---|
264 | instance simultaneously and files from one site won't show up in |
---|
265 | another. |
---|
266 | |
---|
267 | An ExtFileStore instance is available as a global utility |
---|
268 | implementing :class:`waeup.sirp.interfaces.IExtFileStore`. |
---|
269 | |
---|
270 | To add and retrieve files from the storage, use the appropriate |
---|
271 | methods below. |
---|
272 | """ |
---|
273 | |
---|
274 | grok.implements(IExtFileStore) |
---|
275 | |
---|
276 | _root = None |
---|
277 | |
---|
278 | @property |
---|
279 | def root(self): |
---|
280 | """Root dir of this storage. |
---|
281 | |
---|
282 | The root dir is a readonly value determined dynamically. It |
---|
283 | holds media files for sites or other components. |
---|
284 | |
---|
285 | If a site is available we return a ``media/`` dir in the |
---|
286 | datacenter storage dir. |
---|
287 | |
---|
288 | Otherwise we create a temporary dir which will be remembered |
---|
289 | on next call. |
---|
290 | |
---|
291 | If a site exists and has a datacenter, it always takes |
---|
292 | precedence over temporary dirs, even after a temporary |
---|
293 | directory was created. |
---|
294 | |
---|
295 | Please note that retrieving `root` is expensive. You might |
---|
296 | want to store a copy once retrieved in order to minimize the |
---|
297 | number of calls to `root`. |
---|
298 | |
---|
299 | """ |
---|
300 | site = grok.getSite() |
---|
301 | if site is not None: |
---|
302 | root = os.path.join(site['datacenter'].storage, 'media') |
---|
303 | return root |
---|
304 | if self._root is None: |
---|
305 | self._root = tempfile.mkdtemp() |
---|
306 | return self._root |
---|
307 | |
---|
308 | def __init__(self, root=None): |
---|
309 | self._root = root |
---|
310 | return |
---|
311 | |
---|
312 | def getFile(self, file_id): |
---|
313 | """Get a file stored under file ID `file_id`. |
---|
314 | |
---|
315 | Returns a file already opened for reading. |
---|
316 | |
---|
317 | If the file cannot be found ``None`` is returned. |
---|
318 | |
---|
319 | This method takes into account registered handlers for any |
---|
320 | marker put into the file_id. |
---|
321 | |
---|
322 | .. seealso:: :class:`DefaultFileStoreHandler` |
---|
323 | """ |
---|
324 | marker, filename, base, ext = self.extractMarker(file_id) |
---|
325 | handler = queryUtility(IFileStoreHandler, name=marker, |
---|
326 | default=DefaultFileStoreHandler()) |
---|
327 | path = handler.pathFromFileID(self, self.root, file_id) |
---|
328 | if not os.path.exists(path): |
---|
329 | return None |
---|
330 | fd = open(path, 'rb') |
---|
331 | return fd |
---|
332 | |
---|
333 | def getFileByContext(self, context): |
---|
334 | """Get a file for given context. |
---|
335 | |
---|
336 | Returns a file already opened for reading. |
---|
337 | |
---|
338 | If the file cannot be found ``None`` is returned. |
---|
339 | |
---|
340 | This method takes into account registered handlers and file |
---|
341 | name choosers for context types. |
---|
342 | |
---|
343 | This is a convenience method that internally calls |
---|
344 | :meth:`getFile`. |
---|
345 | |
---|
346 | .. seealso:: :class:`FileStoreNameChooser`, |
---|
347 | :class:`DefaultFileStoreHandler`. |
---|
348 | """ |
---|
349 | file_id = IFileStoreNameChooser(context).chooseName() |
---|
350 | return self.getFile(file_id) |
---|
351 | |
---|
352 | def createFile(self, filename, f): |
---|
353 | """Store a file. |
---|
354 | """ |
---|
355 | file_id = filename |
---|
356 | root = self.root # Calls to self.root are expensive |
---|
357 | marker, filename, base, ext = self.extractMarker(file_id) |
---|
358 | handler = queryUtility(IFileStoreHandler, name=marker, |
---|
359 | default=DefaultFileStoreHandler()) |
---|
360 | f, path, file_obj = handler.createFile( |
---|
361 | self, root, file_id, filename, f) |
---|
362 | dirname = os.path.dirname(path) |
---|
363 | if not os.path.exists(dirname): |
---|
364 | os.makedirs(dirname, 0755) |
---|
365 | open(path, 'wb').write(f.read()) |
---|
366 | return file_obj |
---|
367 | |
---|
368 | def extractMarker(self, file_id): |
---|
369 | """Split filename into marker, filename, basename, and extension. |
---|
370 | |
---|
371 | A marker is a leading part of a string of form |
---|
372 | ``__MARKERNAME__`` followed by the real filename. This way we |
---|
373 | can put markers into a filename to request special processing. |
---|
374 | |
---|
375 | Returns a quadruple |
---|
376 | |
---|
377 | ``(marker, filename, basename, extension)`` |
---|
378 | |
---|
379 | where ``marker`` is the marker in lowercase, filename is the |
---|
380 | complete trailing real filename, ``basename`` is the basename |
---|
381 | of the filename and ``extension`` the filename extension of |
---|
382 | the trailing filename. See examples below. |
---|
383 | |
---|
384 | Example: |
---|
385 | |
---|
386 | >>> extractMarker('__MaRkEr__sample.jpg') |
---|
387 | ('marker', 'sample.jpg', 'sample', '.jpg') |
---|
388 | |
---|
389 | If no marker is contained, we assume the whole string to be a |
---|
390 | real filename: |
---|
391 | |
---|
392 | >>> extractMarker('no-marker.txt') |
---|
393 | ('', 'no-marker.txt', 'no-marker', '.txt') |
---|
394 | |
---|
395 | Filenames without extension give an empty extension string: |
---|
396 | |
---|
397 | >>> extractMarker('no-marker') |
---|
398 | ('', 'no-marker', 'no-marker', '') |
---|
399 | |
---|
400 | """ |
---|
401 | if not isinstance(file_id, basestring) or not file_id: |
---|
402 | return ('', '', '', '') |
---|
403 | parts = file_id.split('__', 2) |
---|
404 | marker = '' |
---|
405 | if len(parts) == 3 and parts[0] == '': |
---|
406 | marker = parts[1].lower() |
---|
407 | file_id = parts[2] |
---|
408 | basename, ext = os.path.splitext(file_id) |
---|
409 | return (marker, file_id, basename, ext) |
---|
410 | |
---|
411 | grok.global_utility(ExtFileStore, provides=IExtFileStore) |
---|
412 | |
---|
413 | class DefaultStorage(ExtFileStore): |
---|
414 | """Default storage for files. |
---|
415 | |
---|
416 | Registered globally as utility for |
---|
417 | :class:`hurry.file.interfaces.IFileRetrieval`. |
---|
418 | """ |
---|
419 | grok.provides(IFileRetrieval) |
---|
420 | |
---|
421 | grok.global_utility(DefaultStorage, provides=IFileRetrieval) |
---|
422 | |
---|
423 | class DefaultFileStoreHandler(grok.GlobalUtility): |
---|
424 | """A default handler for external file store. |
---|
425 | |
---|
426 | This handler is the fallback called by external file stores when |
---|
427 | there is no or an unknown marker in the file id. |
---|
428 | |
---|
429 | Registered globally as utility for |
---|
430 | :class:`waeup.sirp.interfaces.IFileStoreHandler`. |
---|
431 | """ |
---|
432 | grok.implements(IFileStoreHandler) |
---|
433 | |
---|
434 | def pathFromFileID(self, store, root, file_id): |
---|
435 | """Return the root path of the external file store joined with the file id. |
---|
436 | """ |
---|
437 | return os.path.join(root, file_id) |
---|
438 | |
---|
439 | def createFile(self, store, root, filename, file_id, f): |
---|
440 | """Information about what exactly to store and where. |
---|
441 | |
---|
442 | When a file should be handled by an external file storage, it |
---|
443 | looks up any handlers (like this one), passes runtime information |
---|
444 | like the storage object, root path, filename, file_id, and the |
---|
445 | raw file object itself. |
---|
446 | |
---|
447 | The handler can then change the file, raise exceptions or |
---|
448 | whatever and return the result. |
---|
449 | |
---|
450 | This handler returns the input file as-is, a path returned by |
---|
451 | :meth:`pathFromFileID` and an instance of |
---|
452 | :class:`hurry.file.HurryFile` for further operations. |
---|
453 | |
---|
454 | Please note: although a handler has enough information to store the |
---|
455 | file itself, it should leave that task to the calling file |
---|
456 | store. |
---|
457 | """ |
---|
458 | path = self.pathFromFileID(store, root, file_id) |
---|
459 | return f, path, HurryFile(filename, file_id) |
---|
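
The marker convention described in the module docstring can be tried out in isolation. The following is a minimal sketch, not part of ``waeup.sirp``: ``extract_marker`` is a hypothetical standalone function that mirrors the logic of ``ExtFileStore.extractMarker``, assuming Python 3 (so ``str`` stands in for ``basestring``) and no Zope or grok setup.

```python
import os


def extract_marker(file_id):
    """Split a file id into (marker, filename, basename, extension).

    Hypothetical standalone copy of the logic in
    ``ExtFileStore.extractMarker``, for illustration only.
    """
    # Anything that is not a non-empty string yields four empty strings.
    if not isinstance(file_id, str) or not file_id:
        return ('', '', '', '')
    # A marker is only recognized at the very start: '__MARKER__rest'.
    parts = file_id.split('__', 2)
    marker = ''
    if len(parts) == 3 and parts[0] == '':
        marker = parts[1].lower()   # handler names are lower case
        file_id = parts[2]          # the remaining "real" filename
    basename, ext = os.path.splitext(file_id)
    return (marker, file_id, basename, ext)


if __name__ == '__main__':
    print(extract_marker('__IMG_USER__manfred.jpg'))
    # ('img_user', 'manfred.jpg', 'manfred', '.jpg')
    print(extract_marker('no-marker.txt'))
    # ('', 'no-marker.txt', 'no-marker', '.txt')
```

The lowercased marker (``img_user`` in the first call) is the name under which the file store looks up an ``IFileStoreHandler`` utility. An empty marker has no matching handler registration in this sketch's terms, which corresponds to the case where the store falls back to ``DefaultFileStoreHandler``. Double underscores that do not start the string are not treated as a marker.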