Email input block: Extracting email and attachment content

Modified on Fri, 13 Nov 2020 at 11:20 AM

If your data is arriving on a regular basis by email and your job is to download, clean and analyse it, you'll be pleased to know that there is a better way to deal with this scenario.

The Email source block allows you to import email data from your online email account (e.g. Gmail, Hotmail, Yahoo Mail). It supports advanced filtering options and can also download attachments to a chosen folder.

This guide describes the options used to configure the block and the best practices to ensure optimal performance.


Connection options

The Connection options specify how Omniscope can connect and communicate with your email server.

These options will be different for each server, and can typically be found via an online search or in your server documentation. Some email servers require you to enable access before you can connect. For example, if you use Gmail with two-factor authentication, you will need to generate an app-specific password first. In other cases you may need to enable external access and/or IMAP/POP-3.


Protocol

Omniscope supports both the IMAP and POP-3 protocols.

IMAP stands for Internet Message Access Protocol. It is a fast, modern email protocol.

POP-3 stands for Post Office Protocol version 3. It is an older, slower protocol.


We recommend using IMAP wherever possible for the following reasons:

    IMAP is typically much faster than POP-3. IMAP allows Omniscope to request messages in batches. When using POP-3 Omniscope can only request messages one at a time.

    When using IMAP you can select one or more folders, so you can choose to download messages from your Inbox or Sent folder or any custom folder you have set up. If you are using POP-3 you can only download messages from your Inbox.

    IMAP allows filters to be executed and managed by your email server. When using POP-3 email server filtering is not supported; instead Omniscope must download every message in turn and check it against the supplied filter.
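
To illustrate the third point, here is a minimal Python sketch of server-side filtering using the standard imaplib module. It is not Omniscope's implementation; the function names build_search_criteria and search_folder are hypothetical. It only shows that, with IMAP, the search is sent to the server and just the matching message numbers come back.

```python
import imaplib


def build_search_criteria(from_address="", subject=""):
    """Build IMAP SEARCH criteria; an empty value adds no criterion.

    Hypothetical helper. Values containing double quotes are not
    escaped here, so this is only suitable for simple search terms.
    """
    criteria = []
    if from_address:
        criteria += ["FROM", f'"{from_address}"']
    if subject:
        criteria += ["SUBJECT", f'"{subject}"']
    # Several criteria in one SEARCH command are combined with AND.
    # With no filters at all, ALL matches every message in the folder.
    return criteria or ["ALL"]


def search_folder(conn: imaplib.IMAP4, folder, criteria):
    """Ask the server to run the search; return matching message numbers."""
    conn.select(folder, readonly=True)
    status, data = conn.search(None, *criteria)
    if status != "OK":
        raise RuntimeError(f"IMAP search failed: {status}")
    return data[0].split()
```

With POP-3 there is no equivalent search command, which is why every message has to be downloaded and checked on the client.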


URL/Port

The URL/Port define the endpoint of your email server. If your server supports both IMAP and POP-3 it will have different endpoints for each protocol.

Use SSL

SSL stands for Secure Sockets Layer and is used to encrypt your email data during communication between your server and Omniscope. Most email servers should support SSL.


Username/Password

You need to enter your username and password in order for your email server to authenticate you. In some cases this may not be the password you use to log in to your account; you may need to generate a separate app-specific password.


Connection examples

The following are connection details for some common email servers. They were correct at the time this guide was written, but you should always refer to the provider's online documentation to ensure your details are up-to-date.

As mentioned previously, you should use IMAP wherever that option exists and it is appropriate to do so.

Gmail

If you are using Gmail and two-factor authentication, the password you use to log in to your Gmail account will not work. Instead you need to generate an app-specific password. Instructions can be found in Google's account help pages.


Gmail (IMAP)

Mail server protocol:   IMAP
Mail server URL: imap.gmail.com
Mail server port: 993
Use SSL encryption:     true    

Gmail (POP3)

Mail server protocol:   POP3
Mail server URL: pop.gmail.com
Mail server port: 995
Use SSL encryption:     true    

Hotmail (IMAP)

Mail server protocol:   IMAP
Mail server URL: imap-mail.outlook.com
Mail server port: 993
Use SSL encryption:     true    

Yahoo mail (IMAP)

Mail server protocol:   IMAP
Mail server URL: imap.mail.yahoo.com
Mail server port: 993
Use SSL encryption:     true
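
If you want to check a set of connection details outside Omniscope before entering them in the block, a sketch along the following lines may help. It uses Python's standard imaplib and poplib modules; the MailServer class and connect function are hypothetical names chosen for this example, and the two server entries simply repeat the Gmail and Yahoo IMAP details listed above.

```python
import imaplib
import poplib
from dataclasses import dataclass


@dataclass(frozen=True)
class MailServer:
    protocol: str          # "IMAP" or "POP3"
    host: str              # Mail server URL
    port: int              # Mail server port
    use_ssl: bool = True   # Use SSL encryption


# Values copied from the connection examples above.
GMAIL_IMAP = MailServer("IMAP", "imap.gmail.com", 993)
YAHOO_IMAP = MailServer("IMAP", "imap.mail.yahoo.com", 993)


def connect(server, username, password):
    """Open an authenticated connection; raises on bad details."""
    if server.protocol == "IMAP":
        cls = imaplib.IMAP4_SSL if server.use_ssl else imaplib.IMAP4
        conn = cls(server.host, server.port)
        conn.login(username, password)
        return conn
    if server.protocol == "POP3":
        cls = poplib.POP3_SSL if server.use_ssl else poplib.POP3
        conn = cls(server.host, server.port)
        conn.user(username)
        conn.pass_(password)
        return conn
    raise ValueError(f"Unsupported protocol: {server.protocol}")
```

For example, connect(GMAIL_IMAP, "you@gmail.com", app_password) should succeed only if IMAP access is enabled and the app-specific password is valid.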


Data options

The data options allow you to configure the structure of the data in your Email block.


Fields

The Fields drop-down allows you to pick which fields you want to include in your data. When you first configure the block the selected fields are: Subject, Sender, Recipients and Date.

There are several unselected fields in the default configuration: Content Text, Content HTML, Attachment names and Downloaded attachment names. We refer to these as Content fields, as they retrieve message content data. Only pick these fields if you need this data, because extracting content makes the block take longer to execute. Using IMAP, Omniscope can read 1000 messages in only a few seconds without the Content fields selected, but this may increase to a minute if you choose one or more Content fields.
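
The cost difference comes from how much of each message has to be transferred. The following Python sketch shows the general idea using the standard email module: an IMAP client can request only a handful of headers, which is far smaller than the full body. This illustrates the protocol, not how Omniscope is implemented; HEADER_ONLY and parse_default_fields are hypothetical names.

```python
from email import message_from_bytes
from email.utils import getaddresses, parsedate_to_datetime

# IMAP fetch item requesting only four headers, without marking
# the message as read (PEEK). No message body is transferred.
HEADER_ONLY = "(BODY.PEEK[HEADER.FIELDS (SUBJECT FROM TO DATE)])"


def parse_default_fields(raw_headers: bytes) -> dict:
    """Turn a raw header block into the four default fields."""
    msg = message_from_bytes(raw_headers)
    return {
        "Subject": msg.get("Subject", ""),
        "Sender": msg.get("From", ""),
        # Keep only the address part of each recipient.
        "Recipients": [addr for _, addr in getaddresses(msg.get_all("To", []))],
        "Date": parsedate_to_datetime(msg["Date"]) if msg["Date"] else None,
    }
```

Retrieving Content Text or Content HTML would instead require fetching and decoding the message body, which is where the extra time goes.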

Folders

The folders drop-down lets you choose which folders you want to include when reading your messages. You should select at least one folder. If you are using IMAP you should see a list of all the folders you see when you access your online mail account. This will include the Inbox and Sent folders, as well as any account-specific and custom folders you have created. When using POP-3 you will only be able to select the Inbox.


Max messages

The max messages field allows you to specify the maximum number of messages that will be read AFTER any filters have been applied. Messages are typically served in date order, with the most recent being first, so if you set this to 10 then you should see the 10 most recent messages.
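
The order of operations matters here: the limit is applied to the filtered results, not to the mailbox as a whole. A small Python sketch of that behaviour, assuming each message is a dictionary with a "date" key (the function name newest_matching and the message structure are hypothetical):

```python
from datetime import datetime


def newest_matching(messages, matches, max_messages):
    """Apply the filter first, then keep the most recent results."""
    filtered = [m for m in messages if matches(m)]
    # Most recent first, mirroring the order described above.
    filtered.sort(key=lambda m: m["date"], reverse=True)
    return filtered[:max_messages]
```

So with a limit of 10 and a Subject filter, you get the 10 most recent messages that match the filter, even if they are spread across thousands of non-matching messages.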


Filter options

These options allow you to filter messages by: From Address, Recipient Address, Date, Subject and/or Content Body. Each option is presented as a text field. In the text field you can enter part or all of the values you want to filter on. Leave the value empty if you do not want to apply a filter. 

Filters are case insensitive. If you specify more than one filter they will be combined together (AND'd). For example, if you were to configure the following filters:


From address: sarah
Subject: holidays

You may see the following results:

From address: sarah@hotmail.com
Subject: Looking forward to holidays!!!

From address: chris@sarahcorp.com
Subject: HOLIDAYS PLAN


Note - Filters perform much better when using IMAP.
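
The matching rules described in this section (case-insensitive "contains" matching, empty filters ignored, all filters AND'd) can be sketched in a few lines of Python. This is an illustration of the rules only, not Omniscope code; matches_filters and the dictionary keys are hypothetical.

```python
def matches_filters(message, filters):
    """Return True if every non-empty filter is contained in its field."""
    return all(
        wanted.lower() in message.get(field, "").lower()
        for field, wanted in filters.items()
        if wanted  # an empty value means "no filter on this field"
    )


filters = {"from": "sarah", "subject": "holidays"}
first = {"from": "sarah@hotmail.com", "subject": "Looking forward to holidays!!!"}
second = {"from": "chris@sarahcorp.com", "subject": "HOLIDAYS PLAN"}
```

Both example messages above match, because "sarah" appears somewhere in the From address and "holidays" appears in the Subject regardless of case.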


Attachment options

Configure the attachment options if you want to download message attachments when reading message data. You must tick Download attachments in order to enable this.


Attachment filename filter

The filename filter allows you to configure which attachments will be downloaded. The filter value is a wildcard search string (case insensitive). You can use * to represent multiple characters and ? to represent a single character. By default the filter value is set to *.*, meaning all attachments should be downloaded.
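
If you want to test a wildcard pattern before using it, Python's standard fnmatch module behaves in a broadly similar way. This is only an approximation of the block's filter: fnmatch also supports [abc] character sets, and with fnmatch the pattern *.* matches only filenames that contain a dot. The function name attachment_matches is hypothetical.

```python
from fnmatch import fnmatchcase


def attachment_matches(filename, pattern="*.*"):
    """Case-insensitive wildcard match: * is any run, ? is one character."""
    # Lower-case both sides so the result is the same on every platform.
    return fnmatchcase(filename.lower(), pattern.lower())
```

For instance, the pattern *.csv would accept Report.CSV, while data_?.xlsx would accept data_1.xlsx but not data_10.xlsx.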

Attachment folder

Select the folder on the Omniscope server you want to download attachments to. Only folders on the Omniscope local file system are currently supported.

Overwrite existing files

If Overwrite existing files is selected, any downloaded attachment will replace an existing file in your attachment folder that has the same filename. If you untick this option and the file already exists, Omniscope will first check whether the file content is different. If it is different, the downloaded file will be assigned a new unique name, preserving the existing file in place.
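
The decision logic described above can be sketched in Python as follows. This is an assumption-laden illustration rather than Omniscope's code: the function names are hypothetical, and the "name (1).ext" naming scheme is invented for the example, since the guide does not say how the unique name is formed.

```python
from pathlib import Path


def resolve_target(folder: Path, filename: str, content: bytes, overwrite: bool) -> Path:
    """Decide which path a downloaded attachment should be written to."""
    target = folder / filename
    if overwrite or not target.exists():
        return target
    if target.read_bytes() == content:
        return target  # identical content, so nothing needs preserving
    # Different content: pick an unused name (scheme is an assumption).
    counter = 1
    while True:
        candidate = folder / f"{target.stem} ({counter}){target.suffix}"
        if not candidate.exists():
            return candidate
        counter += 1


def save_attachment(folder, filename, content, overwrite):
    """Write the attachment and return the path actually used."""
    path = resolve_target(Path(folder), filename, content, overwrite)
    path.write_bytes(content)
    return path
```

With overwrite disabled, saving a second, different report.csv would leave the original untouched and write the new file under a separate name.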

Saving the block settings

If you wish to use the Email block in multiple files, you can use a password parameter to populate the app password box, and bookmark the whole Email block to save the configuration.

If the password later changes, you can simply edit the parameter. That single change updates all Email blocks that use it, across your different files. You will need to execute each block again for the new password to be applied.

