1
|
Installation:
|
2
|
Install: make install
|
3
|
WARNING: This will delete the current public schema of your VegBIEN DB!
|
4
|
Uninstall: make uninstall
|
5
|
WARNING: This will delete your entire VegBIEN DB!
|
6
|
This includes all archived imports and staging tables.
|
7
|
|
8
|
Maintenance:
|
9
|
Important: Whenever you install a system update that affects PostgreSQL or
|
10
|
any of its dependencies, such as libc, you should restart the PostgreSQL
|
11
|
server. Otherwise, you may get strange errors like "the database system
|
12
|
is in recovery mode" which go away upon reimport.
|
13
|
|
14
|
Data import:
|
15
|
Import data into VegBIEN: . bin/import_all
|
16
|
Using column-based import: . bin/with_all 'import by_col=1'
|
17
|
Stop all running imports: . bin/stop_imports
|
18
|
Archive the last import: make schemas/rotate
|
19
|
Remove the last import: make schemas/public/reinstall
|
20
|
WARNING: This will delete the current public schema of your VegBIEN DB!
|
21
|
Re-import data: make schemas/rotate; . bin/import_all
|
22
|
Note: This will archive the last import.
|
23
|
|
24
|
Backups:
|
25
|
After a new import:
|
26
|
On vegbiendev:
|
27
|
./bin/make_analytical_db &
|
28
|
Wait for make_analytical_db to finish
|
29
|
make schemas/rotate
|
30
|
Rename the rotated schema using the date in the first datasource's log
|
31
|
file name
|
32
|
tail inputs/*/*/logs/<date>-*.log.sql
|
33
|
Check that every input's log ends in "Encountered 0 error(s)"
|
34
|
If many do not, fix the bug and discard the current (partial) import
|
35
|
Otherwise, continue
|
36
|
Delete previous imports so they won't bloat the full DB backup:
|
37
|
make backups/public.<datetime>.backup/remove
|
38
|
make backups/<schema>.backup/test & make backups/vegbien.backup/all &
|
39
|
On local machine:
|
40
|
make inputs/download-logs
|
41
|
make backups/download &
|
42
|
If desired, record the import times in inputs/import.stats.xls
|
43
|
Archived imports:
|
44
|
Back up: make backups/public.<date>.backup &
|
45
|
Note: To back up the last import, you must archive it first (above)
|
46
|
Test: make backups/public.<date>.backup/test &
|
47
|
Restore: make backups/public.<date>.backup/restore &
|
48
|
Remove: make backups/public.<date>.backup/remove
|
49
|
Download: make backups/download
|
50
|
Full DB:
|
51
|
Back up, test, and rotate: make backups/vegbien.backup/all &
|
52
|
Back up and rotate: make backups/vegbien.backup/rotate &
|
53
|
Test: make backups/vegbien.<date>.backup/test &
|
54
|
Restore: make backups/vegbien.<date>.backup/restore &
|
55
|
Download: make backups/download
|
56
|
Import logs:
|
57
|
Download: make inputs/download-logs
|
58
|
|
59
|
Datasource setup:
|
60
|
Add a new datasource: make inputs/<datasrc>/add
|
61
|
<datasrc> may not contain spaces, and should be abbreviated.
|
62
|
If the datasource is a herbarium, <datasrc> should be the herbarium code
|
63
|
as defined by the Index Herbariorum <http://sweetgum.nybg.org/ih/>
|
64
|
Install any MySQL export:
|
65
|
Create database in phpMyAdmin
|
66
|
mysql -p database <export.sql
|
67
|
Add input data for each table present in the datasource:
|
68
|
Choose a table name from <https://projects.nceas.ucsb.edu/nceas/projects
|
69
|
/bien/wiki/VegCSV#Suggested-table-names>, or use a custom name
|
70
|
Note that if this table will be joined together with another table, its
|
71
|
name must end in ".src"
|
72
|
make inputs/<datasrc>/<table>/add
|
73
|
Important: DO NOT just create an empty directory named <table>!
|
74
|
This command also creates necessary subdirs, such as logs/.
|
75
|
Place the CSV for the table in inputs/<datasrc>/<table>/
|
76
|
OR place a query joining other tables together in
|
77
|
inputs/<datasrc>/<table>/create.sql and svn add this file
|
78
|
Important: When exporting relational databases to CSVs, you MUST ensure
|
79
|
that embedded quotes are escaped by doubling them, *not* by
|
80
|
preceding them with a "\" as is the default in phpMyAdmin
|
81
|
If there are multiple part files for a table, and the header is repeated
|
82
|
in each part, make sure each header is EXACTLY the same.
|
83
|
(If the headers are not the same, the CSV concatenation script
|
84
|
assumes the part files don't have individual headers and treats the
|
85
|
subsequent headers as data rows.)
|
86
|
Add <table> to inputs/<datasrc>/import_order.txt before other tables
|
87
|
that depend on it
|
88
|
Auto-create the map spreadsheets:
|
89
|
make inputs/<datasrc>/; make inputs/<datasrc>/
|
90
|
Note: Must be run twice to properly bootstrap all maps.
|
91
|
svn add inputs/<datasrc>/*/{,.}{header,src,map*,VegBIEN}.csv{,.*}
|
92
|
Install the staging tables:
|
93
|
make inputs/<datasrc>/reinstall quiet=1 &
|
94
|
To view progress: tail -f inputs/<datasrc>/<table>/logs/install.log.sql
|
95
|
View the logs: tail -n +1 inputs/<datasrc>/*/logs/install.log.sql
|
96
|
tail provides a header line with the filename
|
97
|
+1 starts at the first line, to show the whole file
|
98
|
For every file with an error 'column "..." specified more than once':
|
99
|
Add a header override file "+header.<ext>" in <table>/:
|
100
|
Note: The leading "+" should sort it before the flat files.
|
101
|
"_" unfortunately sorts *after* capital letters in ASCII.
|
102
|
Create a text file containing the header line of the flat files
|
103
|
Add an ! at the beginning of the line
|
104
|
This signals cat_csv that this is a header override.
|
105
|
For empty names, use their 0-based column # (by convention)
|
106
|
For duplicate names, add a distinguishing suffix
|
107
|
For long names that collided, rename them to <= 63 chars long
|
108
|
Do NOT make readability changes in this step; that is what the
|
109
|
map spreadsheets (below) are for.
|
110
|
Save
|
111
|
svn add inputs/<datasrc>/<table>/<header_override>
|
112
|
If you made any changes, re-run the install command above
|
113
|
Map each table's columns:
|
114
|
In each <table>/ subdir, for each "via map" map.csv:
|
115
|
Open the map in a spreadsheet editor
|
116
|
Open the "core map" /mappings/Veg+-VegBIEN.csv
|
117
|
In each row of the via map, set the right column to a value from the
|
118
|
left column of the core map
|
119
|
Save
|
120
|
Regenerate the derived maps: make inputs/<datasrc>/
|
121
|
Accept the test cases:
|
122
|
make inputs/<datasrc>/test
|
123
|
When prompted to "Accept new test output", enter y and press ENTER
|
124
|
If you instead get errors, do one of the following for each one:
|
125
|
- If the error was due to a bug, fix it
|
126
|
- Add a SQL function that filters or transforms the invalid data
|
127
|
- Make an empty mapping for the columns that produced the error.
|
128
|
Put something in the Comments column of the map spreadsheet to
|
129
|
prevent the automatic mapper from auto-removing the mapping.
|
130
|
When accepting tests, it's helpful to use WinMerge
|
131
|
(see WinMerge setup below for configuration)
|
132
|
svn add inputs/<datasrc>/*/test.xml.ref
|
133
|
Commit: svn ci -m "Added inputs/<datasrc>/" inputs/<datasrc>/
|
134
|
Update vegbiendev:
|
135
|
On vegbiendev: svn up
|
136
|
On local machine: make inputs/upload
|
137
|
On vegbiendev:
|
138
|
Follow the steps under Install the staging tables above
|
139
|
make inputs/<datasrc>/test
|
140
|
|
141
|
Schema changes:
|
142
|
Regenerate schema from installed DB: make schemas/remake
|
143
|
Reinstall DB from schema: make schemas/reinstall
|
144
|
WARNING: This will delete the current public schema of your VegBIEN DB!
|
145
|
Reinstall errors tables: make inputs/install errors_table_only=1
|
146
|
Reinstall staging tables: . bin/reinstall_all
|
147
|
Sync ERD with vegbien.sql schema:
|
148
|
Run make schemas/vegbien.my.sql
|
149
|
Open schemas/vegbien.ERD.mwb in MySQLWorkbench
|
150
|
Go to File > Export > Synchronize With SQL CREATE Script...
|
151
|
For Input File, select schemas/vegbien.my.sql
|
152
|
Click Continue
|
153
|
Click in the changes list and press Ctrl+A or Apple+A to select all
|
154
|
Click Update Model
|
155
|
Click Continue
|
156
|
Note: The generated SQL script will be empty because we are syncing in
|
157
|
the opposite direction
|
158
|
Click Execute
|
159
|
Reposition any lines that have been reset
|
160
|
Add any new tables by dragging them from the Catalog in the left sidebar
|
161
|
to the diagram
|
162
|
Remove any deleted tables by right-clicking the table's diagram element,
|
163
|
selecting Delete '<table name>', and clicking Delete
|
164
|
Save
|
165
|
If desired, update the graphical ERD exports (see below)
|
166
|
Update graphical ERD exports:
|
167
|
Go to File > Export > Export as PNG...
|
168
|
Select schemas/vegbien.ERD.png and click Save
|
169
|
Go to File > Export > Export as SVG...
|
170
|
Select schemas/vegbien.ERD.svg and click Save
|
171
|
Go to File > Export > Export as Single Page PDF...
|
172
|
Select schemas/vegbien.ERD.1_pg.pdf and click Save
|
173
|
Go to File > Print...
|
174
|
In the lower left corner, click PDF > Save as PDF...
|
175
|
Set the Title and Author to ""
|
176
|
Select schemas/vegbien.ERD.pdf and click Save
|
177
|
|
178
|
Testing:
|
179
|
Mapping process: make test
|
180
|
Including column-based import: make test by_col=1
|
181
|
Map spreadsheet generation: make remake
|
182
|
Missing mappings: make missing_mappings
|
183
|
Everything (for most complete coverage): make test-all
|
184
|
|
185
|
WinMerge setup:
|
186
|
Install WinMerge from <http://winmerge.org/>
|
187
|
Open WinMerge
|
188
|
Go to Edit > Options and click Compare in the left sidebar
|
189
|
Enable "Moved block detection", as described at
|
190
|
<http://manual.winmerge.org/Configuration.html#d0e5892>.
|
191
|
Set Whitespace to Ignore change, as described at
|
192
|
<http://manual.winmerge.org/Configuration.html#d0e5758>.
|
193
|
|
194
|
Documentation:
|
195
|
To generate a Redmine-formatted list of steps for column-based import:
|
196
|
make inputs/QMOR/specimens/logs/steps.by_col.log.sql
|
197
|
|
198
|
General:
|
199
|
To see a program's description, read its top-of-file comment
|
200
|
To see a program's usage, run it without arguments
|
201
|
To remake a directory: make <dir>/remake
|
202
|
To remake a file: make <file>-remake
|