Hi! I have 2 questions regarding Endeca that I hope to get help on!
1. If I am using a database (MySQL) as the input to my data store, is it possible to set up an automated process so that the data store updates itself from the database? Right now I manually run the “LoadData” graph to upload/update information in the data store. I am looking for a way to execute the graph automatically on a daily basis.
2. I am having difficulty creating a chart for analysis over time, where I would like to see how the data varies month by month over a period. My data source includes a metadata attribute, “Comment_Created_Date”, which I designated as a “date” value when I created the metadata for this data source. After loading this data source into my data store, “Comment_Created_Date” can be chosen as a Metric (date) in the chart configuration tool, but it is not available as a Dimension. Because of this, I am unable to create a chart for analysis over time. Could there be a mistake in my data loading process?
1 - Typically, periodic reloads of the data are orchestrated using Integrator Server. Integrator Server allows you to schedule ingest graphs, monitor them, and execute them via HTTP. You can learn more about it here: http://docs.oracle.com/cd/E29805_01/integrator.230/DataIntegratorServer.pdf. If you're doing this locally for development purposes, Integrator Server may be overkill. You could instead use your local O/S scheduler to call a .bat file once per day. Instructions on running graphs from the command line can be found here: http://forum.cloveretl.com/viewtopic.php?f=4&t=4986
2 - The Endeca Server does not intuit month from your date on ingest. In your ingest graph, it is recommended that you use a design pattern like this one to derive additional time-based attribution that can be used in your time-based analysis: https://wikis.oracle.com/display/endecainformationdiscovery/Derived+Datetime-based+Attribution
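To make the idea concrete, here is a minimal Python sketch of the kind of derived attribution that pattern produces. This is not the actual design-pattern code from the wiki; the attribute names (Year, Month, MonthYear) and the date format are assumptions for illustration:

```python
from datetime import datetime

def derive_time_attribution(record):
    """Add derived time-based attributes (Year, Month, MonthYear) to a
    record, based on its Comment_Created_Date field. The derived attribute
    names here are illustrative, not mandated by Endeca."""
    d = datetime.strptime(record["Comment_Created_Date"], "%Y-%m-%d")
    record["Year"] = d.year
    record["Month"] = d.strftime("%B")         # e.g. "July"
    record["MonthYear"] = d.strftime("%Y-%m")  # e.g. "2013-07", sorts chronologically
    return record

rec = derive_time_attribution({"Comment_Created_Date": "2013-07-15"})
print(rec["Month"], rec["Year"], rec["MonthYear"])  # July 2013 2013-07
```

Once attributes like these exist on every record, they can be flagged as Dimensions in a View and used to group by month in a chart.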
The other issue you might be having with #2 is that DateTime attributes are not flagged as a "Dimension" by default in your Base View. And, since you cannot modify the Base View, you will need to create another View to use for your Chart, which has your DateTime attribute (or Month attribute that Dan suggested you create) flagged as a "Dimension". That "Dimension" flag is what controls what you are allowed to Group By in your chart.
If you aren't familiar with Views, I suggest you watch Parts 4.2 and 4.3 of the "Getting Started with EID" screencast series: http://www.oracle.com/technetwork/middleware/endeca/learnmore/index.html
Thanks Dan and Carrie, I am now able to solve the 2 problems that I have mentioned!
Now that I am able to display my data over time, I have encountered another problem. For my data over the period of June, July and August, I have records tagged to June and to August, but no records tagged to July. In my chart, the graph shows a non-zero value between June and August, presumably because July has no bucket and the line connects June directly to August. I believe the chart should instead display a value of 0 on the metric (vertical) axis for July. Does anyone have a good way to solve this?
For more advanced implementations, a "calendar" record type (one record per time period) can be of benefit during ETL (it offers additional date attribution that can be joined onto your records), but if written to the Endeca data store as its own record type, it can also provide value in other ways, namely for your issue:
Now when I have an analytic that is powered by an EQL statement that groups by Month, I can offer a "July bucket" with nothing contained in it.
RETURN foo AS
SELECT SUM(p_sales_amt) AS TotalSales
GROUP BY Month
This statement can execute across both the main and the calendar record types. The calendar record type doesn't carry the p_sales_amt property, obviously, but it can offer up the July bucket to fill in any holes in your temporal analysis.
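To show the mechanics of why the extra record type fills the hole, here is a small Python simulation of that GROUP BY running over the union of both record types. The record shapes and attribute names ("Sales", "calendar", Month, p_sales_amt) follow the examples in this thread but are otherwise illustrative:

```python
# Main ("Sales") records: note there is no record for July.
sales = [
    {"RecordType": "Sales", "Month": "June",   "p_sales_amt": 100.0},
    {"RecordType": "Sales", "Month": "June",   "p_sales_amt": 50.0},
    {"RecordType": "Sales", "Month": "August", "p_sales_amt": 75.0},
]

# Calendar records: one per month, carrying no p_sales_amt property.
calendar = [
    {"RecordType": "calendar", "Month": m} for m in ("June", "July", "August")
]

# Grouping over the union of both record types: every month becomes a
# bucket, and a month with no sales records sums to 0.
totals = {}
for rec in sales + calendar:
    totals[rec["Month"]] = totals.get(rec["Month"], 0.0) + rec.get("p_sales_amt", 0.0)

print(totals)  # July is present with 0.0, so the chart line drops to zero
```

Without the calendar records, `totals` would simply have no "July" key, which is exactly why the chart interpolated between June and August.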
Are you able to explain more regarding the solution? I do not quite understand what you mean by having a calendar record type.
Also, regarding your EQL statement, may I ask where I should insert it? And since you are grouping by "Month", do I need to define what "Month" means? I believe it is not defined within the system.
Record types are a widely used, but seldom documented, approach to Endeca data modeling. Since the Endeca Data Store / MDEX Engine is simply a common record store with no semblance of tables, developers often tag records that carry widely different attribution as different record types. The engine itself knows nothing about the different classification of records that the data developer introduced, it just sees records with differing attribution.
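A quick way to picture this: the store is just a flat collection of records, and "record type" is nothing more than an ordinary attribute whose meaning comes entirely from the data developer's conventions. A hypothetical sketch (record contents are illustrative):

```python
# One flat record store; "RecordType" is an ordinary attribute like any
# other. The engine attaches no special meaning to it.
records = [
    {"RecordType": "Sales",    "RecSpec": "S-001",            "p_sales_amt": 100.0, "Month": "June"},
    {"RecordType": "Sales",    "RecSpec": "S-002",            "p_sales_amt": 75.0,  "Month": "August"},
    {"RecordType": "calendar", "RecSpec": "calendar-2013-07", "Month": "July"},
]

# "Filtering by record type" is just filtering on that attribute; records
# of different types can carry entirely different attribution.
sales_only = [r for r in records if r["RecordType"] == "Sales"]
print(len(sales_only))  # 2
```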
To introduce the second "calendar" record type, do the following:
1) Add a new reader, reformatter, and bulk DataStore writer to your ingest .grf on a new "phase".
2) Read in your calendar records from a database or flat file.
3) In the reformatter, add a new field called "RecordType" with a value of "calendar". You may want to tag your other, previously existing records with a record type too (e.g. RecordType:Sales). You'll also need to determine what makes each new calendar record unique, since the bulk writer will expect you to designate which attribute defines uniqueness for the record, a.k.a. the RecSpec.
4) Write the records to the MDEX.
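The steps above can be sketched in Python, roughly what the reformatter in step 3 would produce for each calendar record. The field names RecordType and RecSpec follow the steps above; the year/month inputs and derived attribute names are assumptions for illustration:

```python
from datetime import date

def make_calendar_record(year, month):
    """Build one calendar record per month. The RecSpec must be unique per
    record, so a year-month string serves as the record spec here."""
    d = date(year, month, 1)
    return {
        "RecordType": "calendar",                 # step 3: tag the record type
        "RecSpec": d.strftime("calendar-%Y-%m"),  # step 3: uniqueness attribute
        "Month": d.strftime("%B"),                # must match the attribute names
        "Year": year,                             # derived on the main records
    }

records = [make_calendar_record(2013, m) for m in (6, 7, 8)]
print([r["RecSpec"] for r in records])
# ['calendar-2013-06', 'calendar-2013-07', 'calendar-2013-08']
```

The key point is that Month and Year here must carry exactly the same names and value formats as the time-based attributes derived on your main records, or the GROUP BY will not bucket them together.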
In my first post, I suggested that you leverage a design pattern to introduce new time-based attribution to your records. I assumed one of those new attributes would be called "Month" (which is what I used in the example GROUP BY statement in my previous post), but you may have named it differently, which is fine. The new calendar records' attributes will need to have the same names as the time-based attributes you added then. The EQL I supplied was just meant as an example; I would expect you to use a statement like it to power a chart, substituting the actual attribute names from your two record types.