Revision 11444

derived/biengeo/README.txt: moved commands to run to the top of the README. flagged commands-sections with *** and an identifying label.

View differences:

derived/biengeo/README.txt
old new
  1   1   BIEN geovalidation notes
  2   2   ========================
  3   3
  4     - Dependencies:
      4 + ***** install dependencies:
  5   5   The only dependencies for running these scripts are PostgreSQL 9.1, postgis 2.0,
  6   6   and unzip.
  7   7   Installing these packages on Ubuntu 13.04 should be as simple as these commands:
......
 12  12   sudo apt-get install postgresql-9.1-postgis-2.0
 13  13   sudo apt-get install unzip
 14  14
 15     - [Also see comments embedded in specific scripts in this directory.]
 16     -
 17     - The bash and SQL statements contained in the files as ordered below
 18     - should be applied to carry out geographic name scrubbing and
 19     - geovalidation on a given corpus of BIEN location records.
 20     -
 21     - That said, given the tight deadline under which this was done in order
 22     - to produced a geovalidated BIEN3 corpus in advance of the Nov 2013
 23     - working group meeting, and the corresponding manner in which much of
 24     - this was actually executed piecemeal in an iterative and interactive
 25     - fashion within a bash shell and psql session, I can't guarantee that the
 26     - code in its current state could be run end-to-end without intervention.
 27     - It's close, but probably not bulletproof.
 28     -
     15 + ***** initialize the DB:
 29  16   1. geovalidate.sh
 30  17      - creates postgis DB and loads GADM2 data
 31  18   2. geonames.sh
......
 33  20   3. geonames-to-gadm.sql
 34  21      - contains SQL statements that build linkages between geonames.org
 35  22        names and GADM2 names
     23 +
     24 + ***** geoscrub new data:
 36  25   4. load-geoscrub-input.sh
 37  26      - dumps geoscrub_input from vegbien and loads it into the geoscrub db
 38  27   5. geonames.sql
......
 42  31      - contains (postgis-extended) SQL statements that score the validity
 43  32        of GADM2-scrubbed names against given point coordinates
 44  33
     34 + [Also see comments embedded in specific scripts in this directory.]
     35 +
     36 + The bash and SQL statements contained in the files as ordered below
     37 + should be applied to carry out geographic name scrubbing and
     38 + geovalidation on a given corpus of BIEN location records.
     39 +
     40 + That said, given the tight deadline under which this was done in order
     41 + to produced a geovalidated BIEN3 corpus in advance of the Nov 2013
     42 + working group meeting, and the corresponding manner in which much of
     43 + this was actually executed piecemeal in an iterative and interactive
     44 + fashion within a bash shell and psql session, I can't guarantee that the
     45 + code in its current state could be run end-to-end without intervention.
     46 + It's close, but probably not bulletproof.
     47 +
 45  48   The resulting 'geoscrub' table is what contains the scrubbed (i.e.,
 46  49   GADM2-matched) names and various geovalidation scores.
 47  50
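
The run order that this revision moves to the top of the README could be sketched as a small driver script. This is only an illustration, not part of the commit: the functions `run_step`, `init_db`, and `geoscrub_new_data`, and the variables `GEOSCRUB_DB` and `DRY_RUN`, are hypothetical names. It also assumes that the `.sh` files run with `bash`, that the `.sql` files are applied with `psql` against a database named `geoscrub`, and that every file sits in the current directory. Steps hidden behind the elided lines of the diff are deliberately left out. The README itself warns that the scripts may need manual intervention, so the sketch defaults to a dry run that only prints each command.

```shell
#!/usr/bin/env bash
# Hypothetical driver for the README's ordered steps (illustrative sketch only).
set -euo pipefail

DB="${GEOSCRUB_DB:-geoscrub}"   # assumption: target database is named "geoscrub"
DRY_RUN="${DRY_RUN:-1}"         # 1 = print commands only, 0 = execute them

# Run one step, choosing the runner from the file extension.
run_step() {
  local file="$1"
  local -a cmd
  case "$file" in
    *.sh)  cmd=(bash "$file") ;;
    *.sql) cmd=(psql -v ON_ERROR_STOP=1 -d "$DB" -f "$file") ;;
    *)     echo "unknown step type: $file" >&2; return 1 ;;
  esac
  if [ "$DRY_RUN" = 1 ]; then
    echo "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}

# "initialize the DB" section: steps 1-3 as listed in the README.
init_db() {
  run_step geovalidate.sh
  run_step geonames.sh
  run_step geonames-to-gadm.sql
}

# "geoscrub new data" section: only steps 4 and 5 are visible in this diff;
# later steps are elided in the diff view and therefore omitted here.
geoscrub_new_data() {
  run_step load-geoscrub-input.sh
  run_step geonames.sql
}

init_db
geoscrub_new_data
```

Saved under a hypothetical name such as `run-geoscrub.sh`, running it as-is lists the commands in order; setting `DRY_RUN=0` would execute them, stopping at the first failing step because of `set -e` and `ON_ERROR_STOP`.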

  
