source: main/waeup.sirp/trunk/src/waeup/sirp/utils/helpers.txt @ 10839

Last change on this file since 10839 was 7496, checked in by uli, 13 years ago

Be more careful when testing directory removals: do that in a safe
location or the complete src/ tree might be erased.

:mod:`waeup.sirp.utils.helpers` -- Helpers for SIRP
***************************************************

.. module:: waeup.sirp.utils.helpers

Helper functions for SIRP.

.. :doctest:

:func:`remove_file_or_directory`
================================

.. function:: remove_file_or_directory(path)

   Removes a file or directory given by a path. We can remove files:

     >>> import os
     >>> import tempfile
     >>> old_location = os.getcwd()
     >>> new_location = tempfile.mkdtemp()
     >>> os.chdir(new_location)

     >>> from waeup.sirp.utils.helpers import remove_file_or_directory
     >>> open('blah', 'wb').write('nonsense')
     >>> 'blah' in os.listdir('.')
     True

     >>> remove_file_or_directory('blah')
     >>> 'blah' in os.listdir('.')
     False

   We can remove directories:

     >>> os.mkdir('blah')
     >>> 'blah' in os.listdir('.')
     True

     >>> remove_file_or_directory('blah')
     >>> 'blah' in os.listdir('.')
     False


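
The behaviour shown above could be reproduced with the standard
library alone. The following is only a minimal sketch, not the SIRP
implementation: the name ``remove_path`` is hypothetical, and raising
an error for a missing path is an assumption, since the text above
does not say what happens in that case.

```python
import os
import shutil
import tempfile


def remove_path(path):
    """Remove a file or a whole directory tree at `path`.

    Hypothetical stand-in for remove_file_or_directory; raises
    OSError if `path` does not exist (an assumption of this sketch).
    """
    if os.path.isdir(path) and not os.path.islink(path):
        shutil.rmtree(path)   # real directories: remove recursively
    else:
        os.remove(path)       # files and symlinks: unlink


# Work in a throwaway directory so nothing of value can be erased.
workdir = tempfile.mkdtemp()
target = os.path.join(workdir, 'blah')

with open(target, 'wb') as fd:   # first 'blah' is a file ...
    fd.write(b'nonsense')
remove_path(target)

os.mkdir(target)                 # ... then a directory of the same name
remove_path(target)

remaining = os.listdir(workdir)  # both variants should be gone
shutil.rmtree(workdir)
```

Dispatching on ``os.path.isdir`` keeps the caller from having to know
in advance which kind of entry a path refers to.
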
:func:`copy_filesystem_tree`
============================

.. function:: copy_filesystem_tree(src_path, dst_path[, overwrite=False[, del_old=False]])

   Copies the contents of an (existing) directory to another
   (existing) directory.

   :param src_path: filesystem path to copy from
   :type  src_path: string
   :param dst_path: filesystem path to copy to
   :type  dst_path: string
   :keyword overwrite: Whether existing files with the same names
                     should be overwritten.
   :type  overwrite: bool
   :keyword del_old: Whether copied files and directories should be
                   removed from the source path.
   :type  del_old: bool
   :return: List of non-copied files

   Both directories must exist.

   Unix hidden files and directories (starting with '.') are not
   processed by this function.

   Without any further parameters, we can copy complete file trees:

     >>> os.mkdir('src')
     >>> os.mkdir('dst')
     >>> open(os.path.join('src', 'blah'), 'wb').write('nonsense')

     >>> from waeup.sirp.utils.helpers import copy_filesystem_tree
     >>> result = copy_filesystem_tree('src', 'dst')

   As a result we get a list of non-copied files:

     >>> result
     []

   The created file was indeed copied:

     >>> 'blah' in os.listdir('dst')
     True

   Hidden files (i.e. those starting with a dot) are not copied:

     >>> open(os.path.join('src', '.blah'), 'wb').write('nonsense')
     >>> result = copy_filesystem_tree('src', 'dst')
     >>> '.blah' in os.listdir('dst')
     False

   This function supports some keyword parameters as explained below.

Using ``overwrite``
-------------------

Boolean. If set to ``True``, any existing files and directories of
the same name in the destination dir are overwritten with copies from
the source. Default is ``False``.

Normally, existing files of the same name in the destination are not
overwritten:

    >>> open(os.path.join('src', 'blah'), 'wb').write('newnonsense')
    >>> result = copy_filesystem_tree('src', 'dst')
    >>> open(os.path.join('dst', 'blah'), 'rb').read()
    'nonsense'

Instead the filename is added to the result (a list of non-copied
files):

    >>> result
    ['blah']

If, however, we use ``overwrite``, the existing file will be
overwritten:

    >>> result = copy_filesystem_tree('src', 'dst', overwrite=True)
    >>> open(os.path.join('dst', 'blah'), 'rb').read()
    'newnonsense'

    >>> result
    []

This also works for complete directories:

    >>> os.mkdir(os.path.join('src', 'mydir'))
    >>> os.mkdir(os.path.join('dst', 'mydir'))
    >>> open(os.path.join(
    ...   'src', 'mydir', 'blah'), 'wb').write('srcblah')
    >>> open(os.path.join(
    ...   'dst', 'mydir', 'blah'), 'wb').write('dstblah')

    >>> result = copy_filesystem_tree('src', 'dst', overwrite=True)
    >>> open(os.path.join('dst', 'mydir', 'blah'), 'rb').read()
    'srcblah'


Using ``del_old``
-----------------

Boolean. If set to ``True``, any copied files and directories will be
removed from the src dir. Default is ``False``.

    >>> result = copy_filesystem_tree('src', 'dst', overwrite=True,
    ...                                           del_old=True)
    >>> os.listdir('src')
    ['.blah']

All files and directories are removed from src, except the hidden file
we created at the beginning.


Clean up. We leave the temporary directory before removing it, so
that we never delete the current working directory:

    >>> import shutil
    >>> os.chdir(old_location)
    >>> shutil.rmtree(new_location)



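
To summarise how ``overwrite``, ``del_old``, the hidden-file rule and
the returned list fit together, here is a rough sketch built on
``shutil``. It is not the SIRP implementation: the name
``copy_tree_sketch`` is hypothetical, and replacing an existing
destination directory wholesale is an assumption made to keep the
example short.

```python
import os
import shutil
import tempfile


def copy_tree_sketch(src, dst, overwrite=False, del_old=False):
    """Copy the visible entries of directory `src` into directory `dst`.

    Returns the names of top-level entries that were not copied.
    """
    not_copied = []
    for name in sorted(os.listdir(src)):
        if name.startswith('.'):
            continue                      # hidden entries are not processed
        src_item = os.path.join(src, name)
        dst_item = os.path.join(dst, name)
        if os.path.exists(dst_item):
            if not overwrite:
                not_copied.append(name)   # keep destination, report the name
                continue
            # Assumption: overwriting replaces the whole destination entry.
            if os.path.isdir(dst_item):
                shutil.rmtree(dst_item)
            else:
                os.remove(dst_item)
        if os.path.isdir(src_item):
            # Skip hidden entries inside subdirectories as well.
            shutil.copytree(src_item, dst_item,
                            ignore=shutil.ignore_patterns('.*'))
        else:
            shutil.copy2(src_item, dst_item)
        if del_old:
            # Only entries that were actually copied are removed from src.
            # Caveat: hidden files nested in a copied directory go with it.
            if os.path.isdir(src_item):
                shutil.rmtree(src_item)
            else:
                os.remove(src_item)
    return not_copied


# Usage in a throwaway location.
base = tempfile.mkdtemp()
src = os.path.join(base, 'src')
dst = os.path.join(base, 'dst')
os.mkdir(src)
os.mkdir(dst)
for name, content in (('blah', b'new'), ('.blah', b'hidden')):
    with open(os.path.join(src, name), 'wb') as fd:
        fd.write(content)
with open(os.path.join(dst, 'blah'), 'wb') as fd:
    fd.write(b'old')

first = copy_tree_sketch(src, dst)        # 'blah' exists in dst: not copied
second = copy_tree_sketch(src, dst, overwrite=True, del_old=True)
with open(os.path.join(dst, 'blah'), 'rb') as fd:
    copied = fd.read()                    # now the content from src
left_in_src = os.listdir(src)             # only the hidden file remains
shutil.rmtree(base)
```

Returning the skipped names instead of raising lets the caller decide
whether a name clash is an error or simply something to report.
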
:func:`get_inner_HTML_part()`
=============================

.. function:: get_inner_HTML_part(html_code)

   Get the 'inner' part out of a piece of HTML code.

   Helper function mainly to extract 'real content' from already
   rendered forms.

   The term 'inner part' here means the ``<form>`` part of an HTML
   snippet. If this cannot be found, we look for a ``<body>`` part, and
   if that cannot be found either, we simply return the whole input.

   If a ``<form>`` part can be found in an HTML snippet, it is
   returned with everything before and after it stripped:

     >>> from waeup.sirp.utils.helpers import get_inner_HTML_part
     >>> print get_inner_HTML_part("""<html>
     ... <head>
     ... </head>
     ... <BLANKLINE>
     ... <body>
     ... <form action="http://localhost/myuniversity/faculties/TF/add"
     ...       method="post" class="edit-form"
     ...       enctype="multipart/form-data">
     ...   <h1>Add a department</h1>
     ... </form>
     ... </body>
     ... </html>
     ... """)
     <BLANKLINE>
     <form action="http://localhost/myuniversity/faculties/TF/add"
           method="post" class="edit-form"
           enctype="multipart/form-data">
     <BLANKLINE>
       <h1>Add a department</h1>
     </form>
     <BLANKLINE>
     <BLANKLINE>

   If there is no ``<form>`` part, try to find any ``<body>`` part:

     >>> print get_inner_HTML_part("""<html>
     ... <head>
     ... </head>
     ... <BLANKLINE>
     ... <body>
     ...  <div>Some content</div>
     ... </body>
     ... </html>
     ... """)
     <BLANKLINE>
      <div>Some content</div>
     <BLANKLINE>

   If there is also no ``<body>`` tag, we return the input as-is:

     >>> print get_inner_HTML_part("""<div>
     ...  <div>Some content</div>
     ... </div>
     ... """)
     <div>
      <div>Some content</div>
     </div>

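
The fallback order described above (``<form>`` first, then ``<body>``,
then the unchanged input) can be illustrated with two regular
expressions. This is only a sketch under stated assumptions: the name
``inner_html_sketch`` is hypothetical, regular expressions are a
simplification that will not cope with arbitrary HTML, and the
whitespace of the result is not guaranteed to match what
``get_inner_HTML_part`` prints in the examples above.

```python
import re

# The whole <form> element, tags included.
FORM_RE = re.compile(r'<form\b.*?</form>', re.IGNORECASE | re.DOTALL)
# Only the text between <body ...> and </body>.
BODY_RE = re.compile(r'<body\b[^>]*>(.*?)</body>',
                     re.IGNORECASE | re.DOTALL)


def inner_html_sketch(html_code):
    """Return the <form> element, else the <body> content, else the input."""
    form = FORM_RE.search(html_code)
    if form is not None:
        return form.group(0)
    body = BODY_RE.search(html_code)
    if body is not None:
        return body.group(1)
    return html_code


with_form = inner_html_sketch(
    '<html><body><form action="x"><h1>A</h1></form></body></html>')
with_body = inner_html_sketch('<html><body> <div>c</div> </body></html>')
plain = inner_html_sketch('<div>x</div>')
```

Checking for the form before the body matters, because a rendered
page normally contains both and the form is the more specific match.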