Accern API
Authentication
curl "https://feed.accern.com/v3/alphas?token=TOKEN"
require 'uri'
require 'net/http'
url = URI("https://feed.accern.com/v3/alphas?token=TOKEN")
http = Net::HTTP.new(url.host, url.port)
http.use_ssl = true
request = Net::HTTP::Get.new(url)
response = http.request(request)
puts response.read_body
import requests
url = "https://feed.accern.com/v3/alphas?token=TOKEN"
req = requests.get(url)
text_response = req.text # read response as Text
print(text_response)
json_response = req.json() # read response as JSON
print(json_response)
if (!require("jsonlite")) install.packages("jsonlite")
library(jsonlite)
url <- "https://feed.accern.com/v3/alphas?token=TOKEN"
response <- fromJSON(url)
print(response)
To authenticate, provide your authentication token in the URL. We send the authentication token in your welcome email. Make sure to replace TOKEN in the examples above with your own token.
Feed
curl "https://feed.accern.com/v3/alphas?token=TOKEN"
url = URI("https://feed.accern.com/v3/alphas?token=TOKEN")
url = "https://feed.accern.com/v3/alphas?token=TOKEN"
url <- "https://feed.accern.com/v3/alphas?token=TOKEN"
The above URL returns the most recent 100 documents.
[
{
"id": 1774184,
"article_id": {
"$oid": "589b44a569fe9f7f77024f4a"
},
"article_sentiment": 0.098,
"article_traffic": null,
"article_type": "blog",
"article_url": "http://feedproxy.google.com/~r/RedmondPie/~3/mF7K04DF1y4/",
"author_id": null,
"correlations": null,
"entities": [
{
"name": "Apple Inc.",
"type": "Public",
"index": "S&P 500, Russell 1000, Russell 3000, Wilshire 5000, BARRON'S 400, NASDAQ 100",
"region": "North America",
"sector": "Technology",
"ticker": "AAPL",
"country": "United States",
"exchange": "NASDAQ",
"industry": "Computer Manufacturing",
"entity_id": "EQ0010169500001000",
"global_id": "BBG000B9XRY4",
"competitors": [
"GOOG",
"HPQ"
]
},
{
"name": "Amazon.com, Inc.",
"type": "Public",
"index": "S&P 500, Russell 1000, Russell 3000, Wilshire 5000, NASDAQ 100",
"region": "North America",
"sector": "Consumer Services",
"ticker": "AMZN",
"country": "United States",
"exchange": "NASDAQ",
"industry": "Catalog/Specialty Distribution",
"entity_id": "EQ0021695200001000",
"global_id": "BBG000BVPV84",
"competitors": [
"AAPL",
"BKS"
]
}
],
"event_author_rank": [
{
"author_rank": 4,
"event_group": "Employment Actions"
},
{
"author_rank": 4,
"event_group": "Employment Actions"
}
],
"event_groups": [
{
"type": "Recruitment",
"group": "Employment Actions"
},
{
"type": "Layoff",
"group": "Employment Actions"
}
],
"event_impact_score": {
"overall": 40.88471673254282,
"on_entities": [
{
"entity": "AAPL",
"on_entity": 31
},
{
"entity": "AMZN",
"on_entity": 32
}
]
},
"event_source_rank": [
{
"event_group": "Employment Actions",
"source_rank": 6
},
{
"event_group": "Employment Actions",
"source_rank": 6
}
],
"event_summary": {
"group": "",
"theme": "",
"topic": "",
"action": "",
"sub-theme": "",
"acting_party": ""
},
"first_mention": false,
"harvested_at": "2017-02-08 16:17:39 UTC",
"overall_author_rank": 5,
"overall_source_rank": 6,
"source_id": null,
"story_id": {
"$oid": "589a6b6469fe9f7f70ac1df6"
},
"story_saturation": "high",
"story_sentiment": 0.072,
"story_shares": null,
"story_volume": 58
}
]
GET https://feed.accern.com/v3/alphas?token=TOKEN
By default this request will return the most recent 100 documents.
Filtering
Filter by last_id
curl "https://feed.accern.com/v3/alphas?last_id=1774184&token=TOKEN"
url = URI("https://feed.accern.com/v3/alphas?last_id=1774184&token=TOKEN")
url = "https://feed.accern.com/v3/alphas?last_id=1774184&token=TOKEN"
url <- "https://feed.accern.com/v3/alphas?last_id=1774184&token=TOKEN"
Filter by index
curl "https://feed.accern.com/v3/alphas?index=sp500&token=TOKEN"
url = URI("https://feed.accern.com/v3/alphas?index=sp500&token=TOKEN")
url = "https://feed.accern.com/v3/alphas?index=sp500&token=TOKEN"
url <- "https://feed.accern.com/v3/alphas?index=sp500&token=TOKEN"
Filter by multiple indexes
curl "https://feed.accern.com/v3/alphas?index=sp500,dow30&token=TOKEN"
url = URI("https://feed.accern.com/v3/alphas?index=sp500,dow30&token=TOKEN")
url = "https://feed.accern.com/v3/alphas?index=sp500,dow30&token=TOKEN"
url <- "https://feed.accern.com/v3/alphas?index=sp500,dow30&token=TOKEN"
Filter by ticker
curl "https://feed.accern.com/v3/alphas?ticker=amzn&token=TOKEN"
url = URI("https://feed.accern.com/v3/alphas?ticker=amzn&token=TOKEN")
url = "https://feed.accern.com/v3/alphas?ticker=amzn&token=TOKEN"
url <- "https://feed.accern.com/v3/alphas?ticker=amzn&token=TOKEN"
Filter by multiple tickers
curl "https://feed.accern.com/v3/alphas?ticker=aapl,amzn&token=TOKEN"
url = URI("https://feed.accern.com/v3/alphas?ticker=aapl,amzn&token=TOKEN")
url = "https://feed.accern.com/v3/alphas?ticker=aapl,amzn&token=TOKEN"
url <- "https://feed.accern.com/v3/alphas?ticker=aapl,amzn&token=TOKEN"
Parameter | Description |
---|---|
last_id | Returns the latest 100 documents that came after the provided id. Used to prevent duplicates while keeping in sync (see streaming section). |
index | Filters documents by the index, see below table for supported indexes. To filter by multiple indexes pass a comma separated list of indexes. |
ticker | Filters documents by ticker. To filter by multiple tickers pass a comma separated list of tickers. |
Allowed index values
index | expected query string value |
---|---|
S&P 500 | sp500 |
Russell 1000 | russell1000 |
Russell 3000 | russell3000 |
Wilshire 5000 | wilshire5000 |
Barron’s 400 | barrons400 |
DOW 30 | dow30 |
File Format
curl "https://feed.accern.com/v3/alphas.csv?token=TOKEN"
url = URI("https://feed.accern.com/v3/alphas.csv?token=TOKEN")
url = "https://feed.accern.com/v3/alphas.csv?token=TOKEN"
url <- "https://feed.accern.com/v3/alphas.csv?token=TOKEN"
By default the response from the API feed is in JSON format. If you append .csv to the path, you will get the data in CSV format instead.
CSV Columns |
---|
id |
article_id |
story_id |
harvested_at |
entities_name_1 |
entities_ticker_1 |
entities_global_id_1 |
entities_entity_id_1 |
entities_type_1 |
entities_exchange_1 |
entities_sector_1 |
entities_industry_1 |
entities_country_1 |
entities_region_1 |
entities_index_1 |
entities_competitors_1 |
entities_name_2 |
entities_ticker_2 |
entities_global_id_2 |
entities_entity_id_2 |
entities_type_2 |
entities_exchange_2 |
entities_sector_2 |
entities_industry_2 |
entities_country_2 |
entities_region_2 |
entities_index_2 |
entities_competitors_2 |
event_groups_group_1 |
event_groups_type_1 |
event_groups_group_2 |
event_groups_type_2 |
story_sentiment |
story_saturation |
story_volume |
first_mention |
article_type |
article_sentiment |
overall_source_rank |
event_source_rank_1 |
event_source_rank_2 |
overall_author_rank |
event_author_rank_1 |
event_author_rank_2 |
event_impact_score_overall |
event_impact_score_entity_1 |
event_impact_score_entity_2 |
event_summary_group |
event_summary_theme |
event_summary_topic |
event_summary_action |
event_summary_sub-theme |
event_summary_acting_party |
article_url |
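The CSV endpoint is convenient for loading the feed straight into analysis tools (for example, passing the URL to pandas' read_csv). The small helper below only builds the request URL; the helper name and the use of urlencode are our own, not part of the API.

```python
from urllib.parse import urlencode

BASE = "https://feed.accern.com/v3/alphas"

def feed_url(token, fmt="json", **filters):
    """Build a feed URL; fmt='csv' appends .csv to request CSV instead of JSON."""
    path = BASE + (".csv" if fmt == "csv" else "")
    return path + "?" + urlencode({"token": token, **filters})

print(feed_url("TOKEN", fmt="csv", ticker="aapl"))
# → https://feed.accern.com/v3/alphas.csv?token=TOKEN&ticker=aapl
```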
Streaming
To stay in sync with the API, make a request with last_id=[latest document id],
then grab the id of the latest document that comes back and repeat. We have packaged this logic up in our Accern gem; to install it, follow the instructions on the repo.
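The same request-and-repeat loop can be sketched in Python using the requests library from the earlier examples. The helper names and the polling interval are our own choices, not part of the API.

```python
import time
import requests

FEED = "https://feed.accern.com/v3/alphas"

def newest_id(documents, current_last_id):
    """Return the largest document id seen so far (ids are increasing integers)."""
    return max([current_last_id] + [doc["id"] for doc in documents])

def stream(token, last_id=0, poll_seconds=15):
    """Poll the feed, always asking for documents after the newest id seen."""
    while True:
        resp = requests.get(FEED, params={"token": token, "last_id": last_id})
        resp.raise_for_status()
        documents = resp.json()
        for doc in documents:
            yield doc
        last_id = newest_id(documents, last_id)
        time.sleep(poll_seconds)  # back off between polls when caught up
```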
Backfill
The API allows you to access data going back 30 days; anything older we provide via other means. To start from 30 days ago and move forward, first provide last_id=0, then continue to hit the API while updating the last_id query string parameter.
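The backfill amounts to running the cursor forward from last_id=0 until no more documents come back (treating an empty batch as the caught-up signal is our assumption). A sketch with a stubbed fetch function standing in for the HTTP call, using invented ids:

```python
def run_backfill(fetch, last_id=0):
    """Drain pages via fetch(last_id) -> list of docs until an empty batch."""
    docs = []
    while True:
        batch = fetch(last_id)
        if not batch:
            return docs
        docs.extend(batch)
        last_id = max(doc["id"] for doc in batch)

# Stubbed fetch standing in for the HTTP request, for illustration only:
pages = {0: [{"id": 1}, {"id": 2}], 2: [{"id": 3}], 3: []}
result = run_backfill(lambda last_id: pages.get(last_id, []))
print([d["id"] for d in result])  # → [1, 2, 3]
```

Once run_backfill returns, you would switch to periodic polling with the last id it reached.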
Accern Overview
The Accern API provides a comprehensive, REST-based interface for accessing all finance-related articles processed by our platform within the last 30 days.
Each article passes through our data pipeline, where entities such as equities and financial events are extracted and made available through the API.
We also include numerous insightful analytics such as sentiment, impact score, and story saturation.
NOTE: Further details are in the Accern Analytics section.
Data Attributes
Sample object (article)
The table below lists all the attributes in a single object (article) of the Accern API response.
{
"id": 1774184,
"article_id": {
"$oid": "589b44a569fe9f7f77024f4a"
},
"article_sentiment": 0.098,
"article_traffic": null,
"article_type": "blog",
"article_url": "http://feedproxy.google.com/~r/RedmondPie/~3/mF7K04DF1y4/",
"author_id": null,
"correlations": null,
"entities": [
{
"name": "Apple Inc.",
"type": "Public",
"index": "S&P 500, Russell 1000, Russell 3000, Wilshire 5000, BARRON'S 400, NASDAQ 100",
"region": "North America",
"sector": "Technology",
"ticker": "AAPL",
"country": "United States",
"exchange": "NASDAQ",
"industry": "Computer Manufacturing",
"entity_id": "EQ0010169500001000",
"global_id": "BBG000B9XRY4",
"competitors": [
"GOOG",
"HPQ"
]
},
{
"name": "Amazon.com, Inc.",
"type": "Public",
"index": "S&P 500, Russell 1000, Russell 3000, Wilshire 5000, NASDAQ 100",
"region": "North America",
"sector": "Consumer Services",
"ticker": "AMZN",
"country": "United States",
"exchange": "NASDAQ",
"industry": "Catalog/Specialty Distribution",
"entity_id": "EQ0021695200001000",
"global_id": "BBG000BVPV84",
"competitors": [
"AAPL",
"BKS"
]
}
],
"event_author_rank": [
{
"author_rank": 4,
"event_group": "Employment Actions"
},
{
"author_rank": 4,
"event_group": "Employment Actions"
}
],
"event_groups": [
{
"type": "Recruitment",
"group": "Employment Actions"
},
{
"type": "Layoff",
"group": "Employment Actions"
}
],
"event_impact_score": {
"overall": 40.88471673254282,
"on_entities": [
{
"entity": "AAPL",
"on_entity": 31
},
{
"entity": "AMZN",
"on_entity": 32
}
]
},
"event_source_rank": [
{
"event_group": "Employment Actions",
"source_rank": 6
},
{
"event_group": "Employment Actions",
"source_rank": 6
}
],
"event_summary": {
"group": "",
"theme": "",
"topic": "",
"action": "",
"sub-theme": "",
"acting_party": ""
},
"first_mention": false,
"harvested_at": "2017-02-08 16:17:39 UTC",
"overall_author_rank": 5,
"overall_source_rank": 6,
"source_id": null,
"story_id": {
"$oid": "589a6b6469fe9f7f70ac1df6"
},
"story_saturation": "high",
"story_sentiment": 0.072,
"story_shares": null,
"story_volume": 58
}
Attributes | Type | Description |
---|---|---|
id | integer | unique id for feed (1 or greater) |
article_id.$oid | string | unique id per article |
article_sentiment | decimal | determines if the article was written positively or negatively (-1.000 to 1.000) |
article_type | string | type of content (ex. blog, article) |
article_url | url string | original link to article |
entities | list | List of associated equities objects that are identified for this article |
entities_name | string | name of the company (8,000+ U.S. public equities) |
entities_type | string | classifies whether the company is publicly traded (ex. Public) |
entities_index | string | Comma-separated string of indices company is listed on |
entities_region | string | Region of the company’s headquarters |
entities_sector | string | Sector of the company |
entities_ticker | string | Ticker of the company |
entities_country | string | Country of company’s headquarters |
entities_exchange | string | Exchange the company is traded on |
entities_industry | string | Industry of the company |
entities_entity_id | string | Entity level ID of the company, derived from Bloomberg Open Symbology |
entities_global_id | string | Unique global ID of the company, derived from Bloomberg Open Symbology |
entities_competitors | list | List of top three competitors associated with the company |
event_author_rank | list | Each object indicates the author’s reliability in reporting on specific events |
event_groups | list | Each object has a major event group and a subsection of that group |
event_groups_type | string | A subsection of an event group for more detail |
event_groups_group | string | A major event i.e. event group |
event_impact_score | object | Calculates the article’s impact i.e. chance of affecting the associated company’s stock price |
event_impact_score_overall | decimal | Determines chance of event affecting stock prices in general by end of trading day |
event_impact_score_on_entities | list | Determines chance of event affecting associated company’s stock price by end of trading day |
event_source_rank | list | Each object indicates the source’s reliability in reporting on specific events |
event_summary_topic | string | Level 1 event category |
event_summary_group | string | Level 2 event category |
event_summary_theme | string | Level 3 event category |
event_summary_sub_theme | string | Level 4 event category |
event_summary_action | string | action of an event |
event_summary_acting_party | string | parties associated with the event |
first_mention | boolean | If this article is the first one to break this new story |
harvested_at | datetime | UTC formatted time when Accern received article |
overall_author_rank | integer | rank (1-10) of how reliable author is at releasing articles in general |
overall_source_rank | integer | rank (1-10) of how reliable source is at releasing articles in general |
story_saturation | string | how much exposure this story has currently. ex. high, mid, low |
story_sentiment | decimal | positive/negative sentiment score of the story, averaging the sentiment of related articles published so far |
story_volume | integer | number of articles associated with this story until now |
Data Coverage
A brief elaboration on the kinds of data we process and where it comes from, followed by a table of average daily statistics for the data processing pipeline.
Types of Data -
Accern acquires information from many types of online sources.
- Public News Websites
- Public Blogs
- Press Releases
- Financial Documents ex. SEC Filings
- Other Social Media ex. Tumblr
How we Acquire the Data -
Accern has multiple avenues for financial information. They include our own in-house web scrapers and data obtained via partnered data providers.
Data Providers: The majority of our data comes through our data providers, which currently monitor around 300 million-plus sources (websites).
Proprietary Scrapers: Accern's proprietary crawlers monitor around 500,000-plus public, high-alpha sources. NOTE: These important sources break market-moving news the fastest.
Quick Info on Data Pipeline -
Metric | Value |
---|---|
Total Websites Monitored | 300 million-plus |
Number of Articles Processed Each Day | 5 million-plus |
Number of Articles Delivered Each Day | 20,000-plus |
Format of Data Delivered | JSON, CSV |
Real-time Data Delivery Method | REST API, Web Portal |
Processing Time Per Article Published | 40 milliseconds |
Trading Analytics Derived Per Article | 10-plus |
Archive Date Range Available | 08/25/2012 to 08/19/2016 (4 years) |
Number of Archive Articles | 15 million-plus |
Archive Delivery Method | FTP, Dropbox |
Data Financial Asset Mapping | Tickers, Bloomberg ID |
Financial Assets Coverage | 8,000-plus U.S. public equities |
Financial Events Coverage | 1,000-plus financial events |
Story Classification
THOUSANDS of sources/authors post about the SAME news stories, resulting in MILLIONS of articles per day.
Accern groups articles talking about similar information (i.e. certain equities A, B, & C are involved in some financial event D) into UNIQUE STORIES. Accern pioneered this story classification process, allowing you to track how information flows online.
- You will know the current exposure of a financial news headline
- You will know the overall sentiment of this market event
- You will know if the source that just posted a rumor is reliable, based on the authenticity of its previously published rumor-related stories
- Many more analytics
In Depth: The story classification model is agnostic of article sentiment; it groups based only on the extracted entities and events. The semantic structure of the article is taken as input. The model identifies important themes and checks the last 2 weeks for similar themes; if a similar theme is found, the article is grouped with it. A combination of machine learning models is used to extract entities, which are then linked back to the 8,000-plus U.S. equities and 1,000-plus financial events.
Accern has compiled multiple entity dictionaries for linking names mentioned in media to the correct entities.
Proprietary Entities Dictionary Accern has developed a proprietary dictionary that maps all the ways financial entities are mentioned in the media to the 8,000-plus U.S. public equities.
The dictionary consists of over 150,000 company name variations and includes Bloomberg IDs, ticker symbols, etc. mapped back to these company names. How? - By looking at historical data, we identify the different ways companies have been mentioned, and use them to build and update this dictionary.
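The idea behind the entity dictionary can be illustrated as a lookup from normalized surface forms to canonical tickers. The variations and mapping below are invented for illustration; the real dictionary is far larger and proprietary.

```python
# Toy stand-in for the entity dictionary: many media surface forms
# map to one canonical equity. Entries here are illustrative only.
ENTITY_DICT = {
    "apple": "AAPL",
    "apple inc": "AAPL",
    "apple inc.": "AAPL",
    "amazon": "AMZN",
    "amazon.com": "AMZN",
    "amazon.com, inc.": "AMZN",
}

def link_entity(mention):
    """Normalize a mention and look it up; returns None if unknown."""
    return ENTITY_DICT.get(mention.strip().lower())

print(link_entity("Apple Inc."))  # → AAPL
```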
Proprietary Financial Events Dictionary Accern has developed a proprietary dictionary that maps over 30,000 financial event variations mentioned in media to an aggregated list of 1,000-plus financial events. How? - We worked with financial analysts/equity researchers to figure out an initial list of important financial events. Then, starting with this list and looking at historical data, we figured out different variations of event names.
Noise Cancellation
Accern processes millions of articles per day, a significant share of which ends up as spam, ads, coupons, etc. To tackle this issue, Accern has optimized its noise detection algorithms over the years. This noise cancellation mechanism takes in around 150M+ items (articles, blogs, etc.) every day and emits around 25k articles at the end.
Proprietary Pattern Recognition Spam Detector
Our proprietary noise cancellation mechanism identifies if the input article qualifies as spam and does not contain any insightful information.
How? - A compiled list of regular-expression-based phrases is used to classify articles as spam or not. Using machine learning models and pattern recognition, Accern automatically figures out which articles can be classified as spam based on the way their titles are written.
Example: Save Up to 50% on Your Purchases with Coupons at Amazon.com!
Accern uses pattern recognition to identify future articles talking about coupon codes from Amazon and automatically removes them.
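A minimal sketch of the pattern-based filtering described above, applied to titles. The patterns here are illustrative, not Accern's actual list.

```python
import re

# Illustrative spam-title patterns; the real list is proprietary and larger.
SPAM_PATTERNS = [
    re.compile(r"\bsave up to \d+%", re.IGNORECASE),
    re.compile(r"\bcoupon(s| codes?)?\b", re.IGNORECASE),
    re.compile(r"\bfree shipping\b", re.IGNORECASE),
]

def is_spam_title(title):
    """True if any spam pattern matches the article title."""
    return any(p.search(title) for p in SPAM_PATTERNS)

print(is_spam_title("Save Up to 50% on Your Purchases with Coupons at Amazon.com!"))
# → True
```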
Proprietary Blacklist of Spam Websites Accern has compiled a proprietary list of websites known to release spam and content irrelevant to financial markets investors.
How? - Looking at the historical archive of articles posted by different news sources, we calculate the probability that a newly published article from a source will turn out to be spam. If the probability is above a threshold, we add the source to our blacklist, which is constantly kept up to date.
Accern Analytics
Unique Stories (story_id)
What is it? Financial news stories are published on the web and social media in many forms, for example articles and SEC filings. We scour millions of these articles talking about financial events and group them into similar stories. Each story specifies 2 important details: the equities being talked about (currently among 8,000+ US public equities) and a description of the associated financial events (currently 1,000+ financial event distinctions available).
Quick Definition: a story is an event that involves a company.
How is it created? Every day we scrape a million-plus articles of various types such as blogs, SEC filings, and news articles. These mentions go through the story classification model to identify the associated equities (from the existing 8,000+ US public equities) and the associated financial event (1,000+ distinctions available). All mentions that share these entities are grouped together into their own stories. Each story, a combination of companies and events, is given a unique id (story_id).
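The grouping idea can be sketched by keying each document on its companies plus its financial event. The field names follow the feed's schema, but the keying itself is our simplification, not Accern's actual model.

```python
from collections import defaultdict

def story_key(doc):
    """A story is identified by its set of companies plus the event groups."""
    tickers = tuple(sorted(e["ticker"] for e in doc["entities"]))
    events = tuple(sorted((g["group"], g["type"]) for g in doc["event_groups"]))
    return (tickers, events)

def group_into_stories(docs):
    """Bucket documents that share the same (companies, events) key."""
    stories = defaultdict(list)
    for doc in docs:
        stories[story_key(doc)].append(doc)
    return stories

a = {"entities": [{"ticker": "AAPL"}],
     "event_groups": [{"group": "Legal Actions", "type": "Lawsuit"}]}
b = {"entities": [{"ticker": "AAPL"}],
     "event_groups": [{"group": "Legal Actions", "type": "Lawsuit"}]}
c = {"entities": [{"ticker": "AMZN"}],
     "event_groups": [{"group": "Legal Actions", "type": "Lawsuit"}]}
print(len(group_into_stories([a, b, c])))  # → 2
```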
Examples
An example of a big, viral story regarding Google and Legal Actions consists of articles below that are all talking about similar information.
‘Google alleges Uber stole its self-driving secrets’ by Livemint.com
‘Google accuses Uber of stealing self-drive technology’ by Business-standard.com
‘Lawsuit: Google self-driving car spinout Waymo claims Uber using stolen laser-mapping technology’ by geekwire.com
'Waymo: Uber stole our self-driving car tech’ by cnet.com
An example of a story reporting on a rumour about Apple, which was initially posted on a blog before showing up on other sources with bigger reach.
“Apple reportedly plans to 'significantly’ expand Seattle office after Turi acquisition” by bizjournals.com
“Apple plans expansion of artificial intelligence efforts in Seattle” by forums.imore.com
First Mention (first_mention)
What is it? Whenever a financial news story breaks out, many articles (mentions) get published about the same story. First Mention tells us if the article is the FIRST to break that new story.
Quick Definition: an article about a story that has not been mentioned on the internet for at least 2 weeks.
How is it created? The story classification model extracts a theme (a combination of event and companies) from the input article. It then searches for highly similar themes in the last 2 weeks.
If it finds one, it groups this article with the existing story (theme) and sets its first_mention to FALSE. Otherwise, it creates a new story and this article's first_mention attribute is set to TRUE.
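The 2-week lookback can be sketched as follows. Here `theme` is a hashable (companies, event) key and `recent_themes` maps themes to the time they were last seen; both are our simplification of the model described above.

```python
from datetime import datetime, timedelta

def is_first_mention(theme, harvested_at, recent_themes, window_days=14):
    """True if no similar theme was seen within the lookback window."""
    last_seen = recent_themes.get(theme)
    if last_seen is None:
        return True  # brand-new theme -> new story
    return harvested_at - last_seen > timedelta(days=window_days)

now = datetime(2017, 2, 8, 16, 17, 39)
seen = {("AAPL", "Recruitment"): now - timedelta(days=3)}
print(is_first_mention(("AAPL", "Recruitment"), now, seen))  # → False
print(is_first_mention(("GOOG", "Lawsuit"), now, seen))      # → True
```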
Examples
In brief, the power of knowing an article is the first to break a new story ->
date | headline | first_mention |
---|---|---|
08:50 AM 28 Feb | Xbox launches Netflix-like service for gamers | TRUE |
11:35 AM 28 Feb | GameStop stock price tanks after Microsoft announces new digital-gaming service | FALSE |
Sentiment (article_sentiment)
What is it? Sentiment score (-1 to +1) of the article based on its title and content.
Quick Definition: determines if the article was written positively or negatively by the author/editor.
How is it created? Sentiment analysis of articles involves 3 parallel models (bag of words + n-grams + deep learning).
Bag of Words involves a proprietary list of 300,000-plus positive and negative words, differently weighed, which are used to gauge a base sentiment of an article.
N-grams involves a proprietary list of positive and negative two- to three-word phrases which are used to gauge a more accurate sentiment in articles. These lists are compiled by financial analysts.
Next, a Deep Learning model predicts how much the article is positive or negative based on the vector representation of its text.
Finally, a meta model (ensemble learning) uses the output of all these 3 models to generate a final score.
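A toy version of the bag-of-words leg and the meta model can illustrate the structure. The lexicon, its weights, and the ensemble weights are invented for illustration; the real lists are proprietary.

```python
# Illustrative weighted lexicon (the real one has 300,000+ entries).
LEXICON = {"surge": 0.4, "growth": 0.3, "stable": 0.1,
           "strike": -0.4, "lawsuit": -0.5, "falls": -0.3}

def bag_of_words_sentiment(text):
    """Sum lexicon weights over the words, clipped to the documented range."""
    score = sum(LEXICON.get(w, 0.0) for w in text.lower().split())
    return max(-1.0, min(1.0, round(score, 3)))

def ensemble(scores, weights=(0.3, 0.3, 0.4)):
    """Meta-model sketch: a weighted average of the three model outputs."""
    return round(sum(s * w for s, w in zip(scores, weights)), 3)

print(bag_of_words_sentiment("Baidu revenue falls amid lawsuit"))  # → -0.8
```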
Examples
Snippets of articles with a negative sentiment about a publicly traded equity - Tesco Inc
Tesco strike to escalate - “Over 2-thousand staff in 22 Tesco stores will be on strike by the middle of next week. Another 24 stores were balloted for industrial action by Mandate over the past three nights - 6 agreed to join the 16 stores currently on the picket line. The retailer says the results of the ballot mean there’s an onus on the union to call-off the strike….”
Striking Tesco workers will not have Family Income Supplement suspended - “The ongoing strike at a number of Tesco stores has been suspended from this morning after both sides in the dispute agreed to attend discussions at the invitation of the Labour Court. Tesco has confirmed that it will not make any changes to pre-1996 terms and conditions whilst this process is ongoing. The Mandate trade union said all pickets will be suspended and the talks are expected to get under way this weekend….”
Tesco workers shouldn’t lose Family Income Supplement - “This would be a completely unfair and mean spirited move. I believe that there is scope under the current legislation for the Minister to direct that no such decision is made. ‘By engaging in strike action, the workers are already seeing a reduction in their take home pay; a cut to their FIS payment will devastate families….’”
Sentiment (story_sentiment)
What is it? An average of the sentiment scores of all articles that fall into this story so far. NOTE: If there is only one article in a story so far, then its first_mention=TRUE and story_sentiment equals article_sentiment.
Quick Definition: aggregates article sentiments and calculates the average sentiment for each story
How is it created? Article sentiments, calculated via the procedure explained in the Article Sentiment section above, are aggregated and averaged. As the story grows, the overall sentiment keeps changing and a trend can be captured over time.
Examples
Looking at articles reporting on company earnings of Baidu (BIDU), there exist mixed reviews from different sources. Interestingly, the overall story sentiment saturated to end up as positive as more and more articles were posted.
Initial Negative articles -
- Baidu’s quarterly revenue falls 2.6 percent - “Baidu Inc reported a second straight drop in quarterly revenue as regulatory scrutiny into healthcare and related advertisements continued to take a toll on the Chinese internet search giant. The drop, however, was within the 17.84-18.38 billion yuan range the company had previously forecast. Analysts estimate that healthcare accounts for about 20-30 percent of Baidu’s search revenue, which represents more than 80 percent of the company’s total sales….”
Dispersed Positive articles -
Baidu reports stable 2016 revenue growth - “Chinese Internet giant Baidu reported stable revenue growth in 2016, helped in part by artificial intelligence (AI) upgrades to its various products. Baidu continued to see stable user growth for its search and map services, with its mobile payment business Baidu Wallet attracting 100 million users by the end of 2016, surging 88 percent compared with 2015….”
Baidu posts bleak fourth quarter, but sees business reshuffle driving 2017 growth - “Baidu Inc’s revenue fell for a second straight quarter, hurt by a government crackdown on healthcare advertising, but the internet search giant expects a rebound this year as it retools to find growth outside its core ad business. The company has stumbled over the past few years - firstly from a cash-burning subsidy war with rivals such as Alibaba in businesses like food delivery, movie tickets and taxi hailing….”
Accern Rank (overall_source_rank)
What is it? Accern Rank identifies whether information from a source is posted promptly and whether that information will go viral (similar articles published by others). Rank 1 is lowest and Rank 10 is highest. In other words, it lets you know which sources are usually among the first to publish articles on a new story, and whether they have a knack for posting stories that become widespread.
Quick Definition:
overall_source_rank - determines if the SOURCE is reliable at releasing trending stories, where reliable indicates the ability to post early and trending indicates the potential for widespread stories.
overall_author_rank - determines if the AUTHOR is reliable at releasing trending stories.
How is it created? A graphical model takes into account historical data (past articles): how certain news appeared in the past and what the distribution of articles within a story looked like. It checks which sources posted faster in comparison to other sources that then posted contextually similar articles.
Examples
Overall Source Rank (High) - StreetInsider releases stories first, and their stories get republished by many other sources.
Overall Author Rank (Low-Mid) - John Paul* releases stories on StreetInsider first, but his stories don’t get republished by any other authors.
*
Accern Rank (event_source_rank)
What is it? These ranks are based on the same Accern Rank model, which tries to predict promptness and the likelihood of a story being republished. Rank 1 is lowest and Rank 10 is highest. event_source_rank/event_author_rank is more precise: for example, Tumblr posts rumors faster than others, while Bloomberg posts financial docs faster than others. Note that sources will have varied ranks for different events.
Quick Definition:
event_source_rank - determines if the SOURCE is reliable at releasing articles associated with a financial event.
event_author_rank - determines if the AUTHOR is reliable at releasing articles associated with a financial event.
How is it created? The ranking model is the same. Ranks are calculated by filtering based on financial events.
Examples
Event Source Rank (High) - StreetInsider releases lawsuit stories first, and their lawsuit stories get republished by many other sources.
Event Author Rank (Low-Mid) - John Paul* releases lawsuit stories on StreetInsider late, but his stories are republished by some authors.
Saturation (story_saturation)
What is it? The online exposure of the story, i.e. have a lot of people seen this story/information already? This is one of the useful metrics made possible by story classification.
Quick Definition: gauge the current potential exposure of a story
How is it created? Based on web traffic information (provided by Alexa Rank) for related articles and previous historical data, we predict the exposure of the story at different levels - low, mid, & high.
3-Step Process for Computing Saturation
Step 1: Accumulate web traffic per story. We accumulate total web traffic based on all related articles in a story.
Step 2: Average web traffic per story. We look back at all similar stories' total web traffic and take an average per story.
Step 3: Segment average story web traffic. We segment the average story web traffic into low, mid, and high saturation.
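The final segmentation step can be sketched as simple bucketing. The cutoffs below are invented; Accern derives its segmentation from historical traffic data.

```python
def saturation_level(avg_story_traffic, low_cut=1_000, high_cut=100_000):
    """Bucket average story web traffic into low / mid / high saturation.

    The cutoff values are illustrative assumptions, not Accern's thresholds.
    """
    if avg_story_traffic < low_cut:
        return "low"
    if avg_story_traffic < high_cut:
        return "mid"
    return "high"

print(saturation_level(250_000))  # → high
```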
Examples
Saturation (High) - story published on 100+ websites
Saturation (High) - story published on two major newswires
Saturation (Low) - story published on one medium-traffic website
Saturation (Low) - story published on five small websites
Impact (_overall)
What is it? When a certain type of story appears in the media, we calculate the probability that the stock price of the company moves up/down by more than 1% by EOD. The Overall Impact Score reflects how an event like Company Earnings generally has a higher impact than other events. It is an average across the entity impact scores for the different companies involved.
Quick Definition (event_impact_score_overall): determines if an event has a chance of affecting stock prices of companies in general by more than 1% at the end of the trading day.
How is it created? This is an example of a retrospective metric. Looking back through the historical archive, we evaluate how the market behaved for similar past events. In brief, by overlaying 3+ years of financial events data with stock price market data, we determine if the event has a chance of moving the stock price of companies in general by more than 1% by the end of the day.
Examples
- Event Impact Score Overall (High) - In the past 3 years, whenever a lawsuit happened, it affected the stock prices of companies in general by 1% or more EOD.
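The retrospective calculation can be sketched as the historical hit rate of >1% end-of-day moves, scaled to the 0-100 range the feed uses. The sample returns below are invented, and this is our simplification of the model, not Accern's actual method.

```python
def impact_score(price_moves, threshold=0.01):
    """Share of past occurrences of an event where the stock moved more than
    `threshold` (1%) by end of day, scaled to 0-100."""
    hits = sum(1 for m in price_moves if abs(m) > threshold)
    return round(100.0 * hits / len(price_moves), 2)

# Invented signed EOD returns observed after past instances of an event:
print(impact_score([0.021, -0.004, 0.015, 0.002, -0.032]))  # → 60.0
```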
Impact (_on_entities)
What is it? The Entity Impact Score is more precise, as it returns the probability that a particular event will affect SPECIFIC equities. For example, an event_impact_score_on_entities of 90 for Apple and a 'Mergers & Acquisitions' event conveys that this sort of story/theme has moved the market in the past and there is a high likelihood it will now as well. The same event can also vary in impact score for different companies.
Quick Definition (event_impact_score_on_entities): determines if an event has a chance of affecting the stock price of the mentioned company by more than 1% at the end of the trading day.
How is it created? Using the same procedure to calculate overall impact score of an event, this process filters based on every entity and calculates respective probabilities.
Examples
Sample Snippet
{
"....": "...",
"event_groups": [
{
"type": "Financial Results",
"group": "Company Earnings"
}
],
"event_impact_score": {
"overall": 48.55540720961282,
"on_entities": [
{
"entity": "EBAY",
"on_entity": 26
},
{
"entity": "AMZN",
"on_entity": 36
}
]
}
}
Considering the snippet above, we see that the results of Company Earnings reports have a lower impact on eBay and Amazon than on companies overall.
Similarly, an event involving Criminal Actions/Fraud may have a high overall impact (event_impact_score.overall), but certain entities like Google are impacted 50% less. (event_impact_score.on_entities.on_entity)