The following is a collection of best practice suggestions to consider when building jobs.
- Create new attributes – don’t overwrite your original data: When manipulating data in Openprise, create new attributes for the changed data. That way, you can always compare the original values against what your tasks produced. Then, when writing data back to a data target, map the new attribute to the original attribute in your data target.
- Establish standard naming conventions for new attributes:
One convention is to preface new attributes with OP or 0P, so you always know they’re generated within Openprise. Then append a “modifier” after the original field name indicating which process populated the attribute’s data.
This naming convention has the added benefit of grouping all the OP fields together, making it easy to locate and select the attributes when writing to your data target.
For example, given an attribute in the original input data source called “Country”, you could have the following OP attributes created in a job:
- OP Country clean – i.e., run through a task to remove junk values
- OP Country infer – i.e., inferred the value using a reference data source
- OP Country norm – i.e., normalized the value using a reference data source
- OP Country final – i.e., the final value to push back to your system
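Openprise itself is a no-code tool, but the convention above can be sketched in plain Python as an analogy. The junk list, the normalization mapping, and the helper name below are all hypothetical; the point is that each step writes its result to a new OP-prefixed attribute while the original “Country” value is left untouched:

```python
# Hypothetical sketch of the "new attributes, don't overwrite" convention.
JUNK_VALUES = {"n/a", "none", "-", ""}            # assumed junk-value list
COUNTRY_NORMALIZATION = {"USA": "United States",  # assumed reference data
                         "US": "United States"}

def process(record):
    original = record.get("Country", "")
    # "clean" step: remove junk values
    cleaned = "" if original.strip().lower() in JUNK_VALUES else original.strip()
    record["OP Country clean"] = cleaned
    # "norm" step: normalize via a reference mapping
    record["OP Country norm"] = COUNTRY_NORMALIZATION.get(cleaned, cleaned)
    # "final" step: the value that would be mapped back to the data target
    record["OP Country final"] = record["OP Country norm"]
    return record

record = process({"Email": "a@example.com", "Country": "USA"})
# record["Country"] is still "USA"; record["OP Country final"] is "United States"
```

On export, only “OP Country final” would be mapped back to the “Country” field in the data target; the intermediate OP attributes stay in Openprise for troubleshooting.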
- Establish naming conventions for related jobs: If you have jobs that will eventually be part of the same bot, name them so it is clear that they’re related. If there are several related jobs, adding numbers to indicate order is also helpful. For example, it is easy to see that the following jobs are related, and in what order they are intended to run:
- List Import 01 – Clean and dedupe the list
- List Import 02 – Dedupe list against Marketo
- List Import 03 – Export net new records to Marketo
- Use the Openprise tag feature: Consistent use of tags helps organize related jobs, bots, data sources and data targets.
- Start small, test often: When building a new job, start with a few tasks, run the job, and view the output data in Openprise. When you’re satisfied the task is manipulating the data correctly, add additional tasks.
- Start by working with a subset of your data, especially if you have a very large input data set: Create your job with a “Filter Records” task as the first task, and include a filter to limit the total number of records processed. For example, you can filter on records whose email begins with “a”. This keeps the tasks running quickly while you test. Later, when you’re happy with your job logic, remove the Filter Records task to process the entire data set.
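The subset trick above amounts to a simple predicate over the input records. A minimal Python sketch, assuming records are dictionaries with an “Email” attribute:

```python
# Hypothetical sketch of a "Filter Records" test filter: keep only records
# whose email begins with "a", so a test run touches a small subset.
records = [
    {"Email": "alice@example.com"},
    {"Email": "bob@example.com"},
    {"Email": "anna@example.com"},
]

test_subset = [r for r in records if r.get("Email", "").lower().startswith("a")]
# test_subset contains only alice and anna while you iterate on the job logic
```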
- Separate Processing jobs from Exporting jobs: Although it may seem simpler to have an export task at the end of a cleaning job, doing so means you can’t repeatedly test the cleaning tasks without always writing to your data target. Instead, create a second job to perform the export. That way, you can repeatedly run the Cleaning job and only Export when you have confidence in your cleaning logic.
- Keep jobs relatively short: This allows for easier modification in the future without danger of bumping up against the maximum number of tasks in a job. A good rule of thumb is to keep jobs to fewer than 25 tasks. Note that Openprise allows up to 30 tasks per job.
- Make liberal use of the description field for tasks: Although it involves a bit of additional typing, those comments you enter today will pay off when you revisit your job tasks in a few months and have the comment to help you figure out what the task is supposed to accomplish. Or even better, those comments will be invaluable for the person who inherits your jobs when you are promoted.
- When working with Salesforce Leads, filter on IsConverted=No so you don't process converted leads unless you intend to. When an Openprise data source contains Salesforce leads, both converted and unconverted leads are imported from Salesforce. Often, all you'll need is a filter at the beginning of your lead processing job that excludes converted leads, with the option "Copy unselected data from the input Data Source to the output Data Source without modification" unchecked. As an alternative, you can periodically filter-purge the IsConverted=Yes leads from your data source.
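The IsConverted filter is the same pattern in miniature. A hedged Python sketch, assuming each lead record carries Salesforce's IsConverted flag:

```python
# Hypothetical sketch of excluding converted Salesforce leads up front,
# before any lead-processing steps run.
leads = [
    {"Email": "alice@example.com", "IsConverted": False},
    {"Email": "bob@example.com",   "IsConverted": True},
]

# Keep only unconverted leads; treat a missing flag as unconverted.
unconverted = [l for l in leads if not l.get("IsConverted", False)]
# unconverted contains only alice; bob's converted lead is skipped
```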
- Once a job has been developed and is producing the desired results, do either or both of the following to gain performance improvements:
- Remove any unused job attributes. If you have many unused attributes in your job, Openprise is tracking, indexing and managing those attributes unnecessarily. To remove the unused attributes, go to Edit Tasks > Job Attributes > Remove All Unused Attributes.
- Run the job in production mode. Production mode eliminates the creation and management of each task's output data store, and instead creates and manages only the output from the last (terminating) task, which improves performance. However, if any other job uses the output from an intermediate task, you cannot run the job in production mode.
- Create a new data target for each job "function": It is helpful to create a separate data target for each main job function. For example, you could create separate data targets for list loading, continuous lead cleansing, and lead scoring. Separate data targets let you 1) control exactly which attributes will be used to write back to your system, and 2) with the help of Openprise support, track precisely how active each data target is and see whether a particular data target is updating more data than expected.
- Include filters when writing data back so you only update those values that need updating. For example, if you have an automated lead scoring job that runs nightly and calculates the lead score in an attribute named OP Lead Score and updates the Salesforce Lead Score field, you'll only want to update the lead score if it has changed. To ensure that you aren't updating the lead score with the same value, do the following:
- make sure the Lead Score field in your target system is pulled into Openprise through your Data Source,
- add a filter task just before the Export task that checks that OP Lead Score has a value and that OP Lead Score differs from Lead Score. This eliminates unneeded updates and results in faster job runs.
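The pre-export filter above reduces to two conditions per record. A minimal Python sketch, assuming the attribute names from the example (“Lead Score” from the target system, “OP Lead Score” computed in the job):

```python
# Hypothetical sketch of the pre-export filter: only push records whose
# newly computed score exists and differs from the value already in the
# target system, so unchanged records are never re-written.
records = [
    {"Email": "a@x.com", "Lead Score": 40, "OP Lead Score": 55},   # changed
    {"Email": "b@x.com", "Lead Score": 70, "OP Lead Score": 70},   # unchanged
    {"Email": "c@x.com", "Lead Score": 10, "OP Lead Score": None}, # no value
]

to_export = [
    r for r in records
    if r.get("OP Lead Score") is not None          # OP Lead Score has a value
    and r.get("OP Lead Score") != r.get("Lead Score")  # and it changed
]
# only the first record survives the filter and would be exported
```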