Import an excel file into ms access
Integrating Talend with Excel data
files can be particularly challenging, in this we are going to walk you through
the basic steps required to get started on build a job to load a MS Excel file
into an Access database.
You can download the example excel
file used in this tutorial from here:
Lets assume we are trying to load
data from an excel file that contains the following fields:
- Salesman Name
- Salesman Id
- Account Id
- Number of Units
The file would look something like
this:
We want to load this file into a MS
Access database table with a schema as follows:
Our Talend job will need to load
only the fields as required in the database table, here are the steps we will
take to build our Talend Open Studio solution:
Step 1: Open Talend
Open Talend and create or open an
existing project
Step 2: Create a new job
- Right click on Job Designs in the Repository window and select “Create job”
- Name the job “excel_load_example”
Step 3: Create a Ms Excel file
repository element
Now we need to create a repository
item for our excel file. To do this click the arrow next to “Metadata” in
the Repository window and right click on “File Excel” and select “Create file
Excel”.
Enter a name for your example file
schema and click next:
Select your example file that we saw
earlier (or use your own) by clicking the “Browse” button.
Then select which work sheets in the
Excel file contain the data we want to use for our schema.
Click Next
On the next screen select the
checkbox for “Set heading row as column names”, then click “Refresh Preview”.
Click Next.
Talend has now generated an
estimated schema, review this schema and make any changes as you would like,
then click Finish.
Step 4: Create an Access Database
Table Repository Element
In this next step we are going to
create a repository element for the Access database table schema so that we can
use this in our job design.
Right click on “DB Connections”
under “Metadata” in the Repository window and select “Create connection”
Name your database then click next:
Select DB Type as “Access” and DB
Version as “Access 2007″ or “Access 2003″ if you are using that version.
Click on the “Browse” button and
navigate to your Access database file.
Once selected then click on the
“Check” button to make sure that Talend is able to connect to your DB.
If the connection works then click
“Finish”
Now the DB Connection should appear
under “Repository” -> “Metadata” -> “Db Connections” in the panel on the
left
Right click on the DB name in this
panel and and select “Retrieve schema”
On the pop up screen, click “Next”
On the following screen, select the
“sales” table from our database and click “Next”
The final window will display the
schema of your database table, which can now be used in the repository for
future job design. Review the schema and make any changes if needed, then click
“Finish”.
Step 5: Design your job
In this step we are are going to
design our job to connect our MS Excel file to our MS Access DB.
Open the job we created in Step 2 by
double clicking the name of the job under Job Designs in the Repository window.
In the palette window on the right
hand side type “fileinputexcel” into the search box, then drag and drop the
component “tFileInputExcel” into the job window in the center of the screen.
Select the tFileInputExcel Component
in the job window and then select the “Component” tab near the bottom middle of
the screen.
Click the drop down box next to
“Property type” and select “Repository”
Then click the button with three
dots that appears and select the Excel file from the repository that we created
in Step 3.
If the file is in Excel 2007 format
then click the check box “Read excel 2007 file format (xlsx)”
Select a tMap component from the
palette on the right hand side of the screen and drag it into the job design
window
Right click on the tFileInputExcel
component, select Row->Main and connect a row to the tMap component.
Select a tAccessOutput component
from the Palette and drag it over to the job design window.
Click on the tAccessOutput component
to select it, then navigate to the Component tab in the lower middle of the
screen
Under Property Type, select
Repository and then select the access database from the repository that we
created in Step 4
Under Schema, select Repository and
then select the “sales” table from our database in the repository
Next right click the tMap component,
Select “Row” then “New Output”.
Connect this new output to the
tAccessOutput component
Name the output row- “output1″
A popup window will appear asking
you: “Do you want to get the schema of the target component” – click “Yes”.
Now your tMap is connected to the
tAccessOutput component, double click the tMap component to open up the
tMap editor.
Click and drag the data fields from
the “row1″ panel to the left of the screen to the corresponding “output1″ data
fields. This tMap editor enables you to map to fields from the Excel input to
the correct output fields in Access.
Click “Apply” then “OK” to save the
changes and return to the Job Design window
Now your job design is complete and
we just need to run the job to load the file into Access
Step 6: Run your Job
In the group of tabs in the lower
middle of the screen, select the “Run” tab
Under Execution, click the “Run”
button
Your job will load and then run and
you will see how many rows were processed in the job design window.
Now loaded your data into your
Access DB table.
I think this is a very useful aspect of complex tableau operations and also useful to provide so many solutions to tough and complicated problems.
ReplyDeleteTableau Soap Connection