[go-nuts] Universal func for exporting data from XML to DB
Pavel Pavlenko
2015-07-30 07:15:31 UTC
Hi everyone! I need your help.

Task: I have 15 different XML files and need to export their data to 15
tables in Postgres.
Right now I have 15 different Export functions.
In a for loop I read elements from the XML and insert them into the DB.
The only differences between these 15 funcs are the table fields and parameters.

OK, I can pass the table fields to the func in a slice. But the values for the
DB insert are generated within the func, and the number of parameters differs
from one XML file to the next:
https://github.com/pavlik/fias_xml2postgresql/blob/bcc406d246b6225c61786bb573b7089b68fb500f/structures/address_object/address_object_bulk.go#L185

I'm at an impasse. How can I collapse the 15 funcs into one?
Konstantin Khomoutov
2015-07-30 17:28:35 UTC
On Thu, 30 Jul 2015 00:15:31 -0700 (PDT)
Post by Pavel Pavlenko
Task: I have 15 different XML files and need to export their data
to 15 tables in Postgres.
Right now I have 15 different Export functions.
In a for loop I read elements from the XML and insert them into the DB.
The only differences between these 15 funcs are the table fields and
parameters.
OK, I can pass the table fields to the func in a slice. But the values
for the DB insert are generated within the func, and the number of
parameters differs from one XML file to the next:
https://github.com/pavlik/fias_xml2postgresql/blob/bcc406d246b6225c61786bb573b7089b68fb500f/structures/address_object/address_object_bulk.go#L185
I'm at an impasse. How can I collapse the 15 funcs into one?
Basically there are two approaches to such a problem:

1) Reflection (happens at runtime).
2) Code generation (happens before compile time).

With reflection, you annotate the relevant types with additional data, and
at runtime your code inspects instances of those types, extracts that data
from them, and uses it to decide what to do next.

This sounds complicated, but in fact it's not *that* complicated.
You already rely on reflection to parse your XML data by using struct
tags on appropriate types:

| // Classifier of address-forming elements
| type XmlObject struct {
| 	XMLName    xml.Name `xml:"Object"`
| 	AOGUID     string   `xml:"AOGUID,attr"`
| 	FORMALNAME string   `xml:"FORMALNAME,attr"`
| 	REGIONCODE int      `xml:"REGIONCODE,attr"`
...

Given an instance of such an XmlObject type, the code in the encoding/xml
package uses reflection to read the struct tag [1] of each field of that
instance, looks for tags containing the "xml:" key, and extracts the
information stored under it to learn how to map the data in the XML stream
it processes onto those fields.
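
Here is a minimal sketch of that mechanism, with the XmlObject field set
trimmed for brevity:

| package main
|
| import (
| 	"encoding/xml"
| 	"fmt"
| 	"reflect"
| )
|
| type XmlObject struct {
| 	XMLName xml.Name `xml:"Object"`
| 	AOGUID  string   `xml:"AOGUID,attr"`
| }
|
| func main() {
| 	t := reflect.TypeOf(XmlObject{})
| 	for i := 0; i < t.NumField(); i++ {
| 		f := t.Field(i)
| 		// Tag.Get returns the part of the tag stored under the given key.
| 		fmt.Printf("%s -> %q\n", f.Name, f.Tag.Get("xml"))
| 	}
| }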

Hence one approach is to employ the same idea by using your own struct
tags, say, like this:

| // Classifier of address-forming elements
| type XmlObject struct {
| 	XMLName    xml.Name `xml:"Object"`
| 	AOGUID     string   `xml:"AOGUID,attr" fias:"ao_guid,UUID NOT NULL"`
| 	FORMALNAME string   `xml:"FORMALNAME,attr" fias:"fn,VARCHAR(120) NOT NULL"`
| 	REGIONCODE int      `xml:"REGIONCODE,attr" fias:"rc,INT NOT NULL"`
...

Now in your processing code you reflect over an instance of this XmlObject,
iterate over its fields, extract the struct tag from each, look up the "fias"
key, and use its value to decide what to do with the field.
I have deliberately put SQL DDL bits in there to hint that you could use
this information to generate DDL statements right off your types.
The ordering of values for the corresponding INSERT statements could simply
be the ordering of the fields in the type (as seen when reflecting over it);
a sketch follows.
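
Something along these lines; the "fias" tag format and the address_object
table name are my own invention, as above:

| package main
|
| import (
| 	"fmt"
| 	"reflect"
| 	"strings"
| )
|
| type XmlObject struct {
| 	AOGUID     string `xml:"AOGUID,attr" fias:"ao_guid,UUID NOT NULL"`
| 	FORMALNAME string `xml:"FORMALNAME,attr" fias:"fn,VARCHAR(120) NOT NULL"`
| 	REGIONCODE int    `xml:"REGIONCODE,attr" fias:"rc,INT NOT NULL"`
| }
|
| // columns walks the fields of v and collects the column names, the DDL
| // fragments and the field values, all in struct field order.
| func columns(v interface{}) (names, ddl []string, vals []interface{}) {
| 	rv := reflect.ValueOf(v)
| 	rt := rv.Type()
| 	for i := 0; i < rt.NumField(); i++ {
| 		tag := rt.Field(i).Tag.Get("fias")
| 		parts := strings.SplitN(tag, ",", 2)
| 		if len(parts) != 2 {
| 			continue // field not mapped to a column
| 		}
| 		names = append(names, parts[0])
| 		ddl = append(ddl, parts[0]+" "+parts[1])
| 		vals = append(vals, rv.Field(i).Interface())
| 	}
| 	return names, ddl, vals
| }
|
| func main() {
| 	obj := XmlObject{AOGUID: "deadbeef", FORMALNAME: "Moscow", REGIONCODE: 77}
| 	names, ddl, vals := columns(obj)
|
| 	// DDL statement generated right off the type:
| 	fmt.Printf("CREATE TABLE address_object (%s);\n", strings.Join(ddl, ", "))
|
| 	// INSERT whose values are ordered exactly as the struct fields;
| 	// names, placeholders and vals all share that order, so vals can be
| 	// passed to db.Exec as-is.
| 	ph := make([]string, len(names))
| 	for i := range ph {
| 		ph[i] = fmt.Sprintf("$%d", i+1)
| 	}
| 	fmt.Printf("INSERT INTO address_object (%s) VALUES (%s);\n",
| 		strings.Join(names, ", "), strings.Join(ph, ", "))
| 	fmt.Println(vals)
| }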

Working examples of reflection can be found in the encoding/xml and
encoding/json standard packages.

Code generation, on the other hand, means that you *somehow* describe your
data and how it should be mapped "in" (from XML) and "out" (into the RDBMS
engine), and submit those descriptions to a program which spews out the
appropriate Go code, which is then compiled and run. With this approach
there will still be 15 exporting functions, but that's not a problem,
because they are auto-generated and can be regenerated each time the data
descriptions change.

You can use whatever you wish to describe your data: say, JSON, XML or
annotated structs in Go source files (Go has standard packages for parsing Go
code and getting hold of anything the compiler sees when reading it, including
comments and tags on struct fields); a sketch of the latter follows.
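
For instance, here is a minimal sketch of pulling struct tags out of a Go
source file with the standard go/parser and go/ast packages (the file name
is made up):

| package main
|
| import (
| 	"fmt"
| 	"go/ast"
| 	"go/parser"
| 	"go/token"
| 	"log"
| )
|
| func main() {
| 	fset := token.NewFileSet()
| 	// ParseComments keeps comments in the AST so a generator can read
| 	// annotations from them, too.
| 	f, err := parser.ParseFile(fset, "address_object.go", nil, parser.ParseComments)
| 	if err != nil {
| 		log.Fatal(err)
| 	}
| 	ast.Inspect(f, func(n ast.Node) bool {
| 		st, ok := n.(*ast.StructType)
| 		if !ok {
| 			return true
| 		}
| 		for _, field := range st.Fields.List {
| 			if field.Tag != nil {
| 				// Tag.Value holds the raw back-quoted literal, e.g.
| 				// `xml:"AOGUID,attr" fias:"ao_guid,UUID NOT NULL"`.
| 				fmt.Println(field.Names, field.Tag.Value)
| 			}
| 		}
| 		return true
| 	})
| }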

To generate the code, you either do "true" code generation by building an AST
of your would-be source code (using the already mentioned standard packages)
and then rendering it to source text, or use simple templating via the
text/template standard package or other means.
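
As a sketch of the latter, here is a template which renders one Export
function from a made-up description (the desc values would come from your
real data descriptions):

| package main
|
| import (
| 	"fmt"
| 	"log"
| 	"os"
| 	"strings"
| 	"text/template"
| )
|
| // desc describes one XML-to-table mapping; ColumnList and Placeholders
| // are precomputed before the template runs.
| type desc struct {
| 	Type, Table, ColumnList, Placeholders string
| 	Fields                                []string
| }
|
| var exportTmpl = template.Must(template.New("export").Parse(`
| // Export{{.Type}} is auto-generated; do not edit.
| func Export{{.Type}}(db *sql.DB, obj {{.Type}}) error {
| 	_, err := db.Exec(
| 		"INSERT INTO {{.Table}} ({{.ColumnList}}) VALUES ({{.Placeholders}})",
| {{range .Fields}}		obj.{{.}},
| {{end}}	)
| 	return err
| }
| `))
|
| func main() {
| 	cols := []string{"ao_guid", "fn", "rc"}
| 	ph := make([]string, len(cols))
| 	for i := range ph {
| 		ph[i] = fmt.Sprintf("$%d", i+1)
| 	}
| 	err := exportTmpl.Execute(os.Stdout, desc{
| 		Type:         "XmlObject",
| 		Table:        "address_object",
| 		ColumnList:   strings.Join(cols, ", "),
| 		Placeholders: strings.Join(ph, ", "),
| 		Fields:       []string{"AOGUID", "FORMALNAME", "REGIONCODE"},
| 	})
| 	if err != nil {
| 		log.Fatal(err)
| 	}
| }
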
You might find [2, 3] interesting.

1. https://golang.org/pkg/reflect/#StructTag
2. https://clipperhouse.github.io/gen/
3. http://blog.golang.org/generate