
Accern API

Authentication

curl "https://feed.accern.com/v3/alphas?token=TOKEN"
require 'uri'
require 'net/http'

url = URI("https://feed.accern.com/v3/alphas?token=TOKEN")
http = Net::HTTP.new(url.host, url.port)
http.use_ssl = true
request = Net::HTTP::Get.new(url)
response = http.request(request)
puts response.read_body
import requests

url = "https://feed.accern.com/v3/alphas?token=TOKEN"
req = requests.get(url)
text_response = req.text # read response as Text
print(text_response)

json_response = req.json() # read response as JSON
print(json_response)
if (!require("jsonlite")) install.packages("jsonlite")
library(jsonlite)

url <- "https://feed.accern.com/v3/alphas?token=TOKEN"
response <- fromJSON(url)
print(response)

Make sure to replace TOKEN above with your own API token.

To authenticate, provide your authentication token in the URL as the token query string parameter. We send your authentication token in your welcome email.

Feed

curl "https://feed.accern.com/v3/alphas?token=TOKEN"
url = URI("https://feed.accern.com/v3/alphas?token=TOKEN")
url = "https://feed.accern.com/v3/alphas?token=TOKEN"
url <- "https://feed.accern.com/v3/alphas?token=TOKEN"

The URL above returns the most recent 100 documents.

[
  {
    "id": 1774184,
    "article_id": {
      "$oid": "589b44a569fe9f7f77024f4a"
    },
    "article_sentiment": 0.098,
    "article_traffic": null,
    "article_type": "blog",
    "article_url": "http://feedproxy.google.com/~r/RedmondPie/~3/mF7K04DF1y4/",
    "author_id": null,
    "correlations": null,
    "entities": [
      {
        "name": "Apple Inc.",
        "type": "Public",
        "index": "S&P 500, Russell 1000, Russell 3000, Wilshire 5000, BARRON'S 400, NASDAQ 100",
        "region": "North America",
        "sector": "Technology",
        "ticker": "AAPL",
        "country": "United States",
        "exchange": "NASDAQ",
        "industry": "Computer Manufacturing",
        "entity_id": "EQ0010169500001000",
        "global_id": "BBG000B9XRY4",
        "competitors": [
          "GOOG",
          "HPQ"
        ]
      },
      {
        "name": "Amazon.com, Inc.",
        "type": "Public",
        "index": "S&P 500, Russell 1000, Russell 3000, Wilshire 5000, NASDAQ 100",
        "region": "North America",
        "sector": "Consumer Services",
        "ticker": "AMZN",
        "country": "United States",
        "exchange": "NASDAQ",
        "industry": "Catalog/Specialty Distribution",
        "entity_id": "EQ0021695200001000",
        "global_id": "BBG000BVPV84",
        "competitors": [
          "AAPL",
          "BKS"
        ]
      }
    ],
    "event_author_rank": [
      {
        "author_rank": 4,
        "event_group": "Employment Actions"
      },
      {
        "author_rank": 4,
        "event_group": "Employment Actions"
      }
    ],
    "event_groups": [
      {
        "type": "Recruitment",
        "group": "Employment Actions"
      },
      {
        "type": "Layoff",
        "group": "Employment Actions"
      }
    ],
    "event_impact_score": {
      "overall": 40.88471673254282,
      "on_entities": [
        {
          "entity": "AAPL",
          "on_entity": 31
        },
        {
          "entity": "AMZN",
          "on_entity": 32
        }
      ]
    },
    "event_source_rank": [
      {
        "event_group": "Employment Actions",
        "source_rank": 6
      },
      {
        "event_group": "Employment Actions",
        "source_rank": 6
      }
    ],
    "event_summary": {
      "group": "",
      "theme": "",
      "topic": "",
      "action": "",
      "sub-theme": "",
      "acting_party": ""
    },
    "first_mention": false,
    "harvested_at": "2017-02-08 16:17:39 UTC",
    "overall_author_rank": 5,
    "overall_source_rank": 6,
    "source_id": null,
    "story_id": {
      "$oid": "589a6b6469fe9f7f70ac1df6"
    },
    "story_saturation": "high",
    "story_sentiment": 0.072,
    "story_shares": null,
    "story_volume": 58
  }
]

GET https://feed.accern.com/v3/alphas?token=TOKEN

By default this request will return the most recent 100 documents.

Filtering

Filter by last_id

curl "https://feed.accern.com/v3/alphas?last_id=1774184&token=TOKEN"
url = URI("https://feed.accern.com/v3/alphas?last_id=1774184&token=TOKEN")
url = "https://feed.accern.com/v3/alphas?last_id=1774184&token=TOKEN"
url <- "https://feed.accern.com/v3/alphas?last_id=1774184&token=TOKEN"

Filter by index

curl "https://feed.accern.com/v3/alphas?index=sp500&token=TOKEN"
url = URI("https://feed.accern.com/v3/alphas?index=sp500&token=TOKEN")
url = "https://feed.accern.com/v3/alphas?index=sp500&token=TOKEN"
url <- "https://feed.accern.com/v3/alphas?index=sp500&token=TOKEN"

Filter by multiple indexes

curl "https://feed.accern.com/v3/alphas?index=sp500,dow30&token=TOKEN"
url = URI("https://feed.accern.com/v3/alphas?index=sp500,dow30&token=TOKEN")
url = "https://feed.accern.com/v3/alphas?index=sp500,dow30&token=TOKEN"
url <- "https://feed.accern.com/v3/alphas?index=sp500,dow30&token=TOKEN"

Filter by ticker

curl "https://feed.accern.com/v3/alphas?ticker=amzn&token=TOKEN"
url = URI("https://feed.accern.com/v3/alphas?ticker=amzn&token=TOKEN")
url = "https://feed.accern.com/v3/alphas?ticker=amzn&token=TOKEN"
url <- "https://feed.accern.com/v3/alphas?ticker=amzn&token=TOKEN"

Filter by multiple tickers

curl "https://feed.accern.com/v3/alphas?ticker=aapl,amzn&token=TOKEN"
url = URI("https://feed.accern.com/v3/alphas?ticker=aapl,amzn&token=TOKEN")
url = "https://feed.accern.com/v3/alphas?ticker=aapl,amzn&token=TOKEN"
url <- "https://feed.accern.com/v3/alphas?ticker=aapl,amzn&token=TOKEN"
Parameter Description
last_id Returns the latest 100 documents that came after the provided id. Used to prevent duplicates while staying in sync (see the Streaming section).
index Filters documents by index; see the table below for supported indexes. To filter by multiple indexes, pass a comma-separated list.
ticker Filters documents by ticker. To filter by multiple tickers, pass a comma-separated list.
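The filter parameters above can be combined in a single request. A minimal Python sketch of assembling the query parameters; the build_feed_request helper is our own illustration, not part of any Accern client library:

```python
def build_feed_request(token, last_id=None, index=None, ticker=None):
    """Build the query parameters for a v3/alphas request.

    Multiple indexes or tickers are passed as a comma-separated list,
    as described in the parameter table.
    """
    params = {"token": token}
    if last_id is not None:
        params["last_id"] = last_id
    if index:
        params["index"] = ",".join(index)
    if ticker:
        params["ticker"] = ",".join(ticker)
    return params

# S&P 500 and DOW 30 documents mentioning AAPL or AMZN.
params = build_feed_request("TOKEN", index=["sp500", "dow30"], ticker=["aapl", "amzn"])
```

Passing the resulting dict as `params=` to `requests.get("https://feed.accern.com/v3/alphas", params=params)` produces the same URLs shown in the shell samples.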

Allowed index values

index expected query string value
S&P 500 sp500
Russell 1000 russell1000
Russell 3000 russell3000
Wilshire 5000 wilshire5000
Barron’s 400 barrons400
DOW 30 dow30

File Format

curl "https://feed.accern.com/v3/alphas.csv?token=TOKEN"
url = URI("https://feed.accern.com/v3/alphas.csv?token=TOKEN")
url = "https://feed.accern.com/v3/alphas.csv?token=TOKEN"
url <- "https://feed.accern.com/v3/alphas.csv?token=TOKEN"

By default the API feed responds in JSON format. Append .csv to the path to receive the data in CSV format instead.

CSV Columns
id
article_id
story_id
harvested_at
entities_name_1
entities_ticker_1
entities_global_id_1
entities_entity_id_1
entities_type_1
entities_exchange_1
entities_sector_1
entities_industry_1
entities_country_1
entities_region_1
entities_index_1
entities_competitors_1
entities_name_2
entities_ticker_2
entities_global_id_2
entities_entity_id_2
entities_type_2
entities_exchange_2
entities_sector_2
entities_industry_2
entities_country_2
entities_region_2
entities_index_2
entities_competitors_2
event_groups_group_1
event_groups_type_1
event_groups_group_2
event_groups_type_2
story_sentiment
story_saturation
story_volume
first_mention
article_type
article_sentiment
overall_source_rank
event_source_rank_1
event_source_rank_2
overall_author_rank
event_author_rank_1
event_author_rank_2
event_impact_score_overall
event_impact_score_entity_1
event_impact_score_entity_2
event_summary_group
event_summary_theme
event_summary_topic
event_summary_action
event_summary_sub-theme
event_summary_acting_party
article_url
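A short Python sketch of consuming the CSV variant with the standard library. The payload here is a hypothetical miniature response using a subset of the columns listed above; a real /v3/alphas.csv response carries the full column set:

```python
import csv
import io

# Hypothetical two-row payload with three of the columns listed above.
payload = "id,entities_ticker_1,story_sentiment\n1774184,AAPL,0.072\n"

rows = list(csv.DictReader(io.StringIO(payload)))
print(rows[0]["entities_ticker_1"])  # prints AAPL
```

Note that csv.DictReader yields every field as a string; numeric columns such as story_sentiment need an explicit float() conversion.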

Streaming

To stay in sync with the API, make a request with last_id=[latest document id], grab the id of the latest document that comes back, and repeat. We have packaged this logic in our Accern gem; to install it, follow the instructions in the repo.

Backfill

The API provides access to data going back 30 days; anything older is delivered via other means (see Data Coverage). To start from 30 days ago and move forward, make the first request with last_id=0, then keep hitting the API while advancing the last_id query string parameter.
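A sketch of the backfill walk, assuming an empty page signals that we have caught up; the API's exact behavior at the end of the window is not documented here, so treat that stopping condition as an assumption:

```python
def backfill(fetch):
    """Collect the full 30-day window, oldest first.

    Starts at last_id=0 as described above and advances last_id to the
    largest id on each page until a page comes back empty (assumed to
    mean we have caught up)."""
    last_id = 0
    all_docs = []
    while True:
        docs = fetch(last_id)
        if not docs:
            break
        all_docs.extend(docs)
        last_id = max(d["id"] for d in docs)
    return all_docs
```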

Accern Overview

The Accern API provides a comprehensive, REST-based interface for accessing all financial-related articles processed by our platform within the last 30 days.

Each article is processed through our data pipeline, which extracts entities such as equities and financial events; these are made available through the API.

We also include numerous insightful analytics such as sentiment, impact score, and story saturation.

NOTE: Further details are in the Accern Analytics section.

Data Attributes

Sample object (article)

The table below lists all the attributes in a single object (article) of the Accern API response.

{
    "id": 1774184,
    "article_id": {
      "$oid": "589b44a569fe9f7f77024f4a"
    },
    "article_sentiment": 0.098,
    "article_traffic": null,
    "article_type": "blog",
    "article_url": "http://feedproxy.google.com/~r/RedmondPie/~3/mF7K04DF1y4/",
    "author_id": null,
    "correlations": null,
    "entities": [
      {
        "name": "Apple Inc.",
        "type": "Public",
        "index": "S&P 500, Russell 1000, Russell 3000, Wilshire 5000, BARRON'S 400, NASDAQ 100",
        "region": "North America",
        "sector": "Technology",
        "ticker": "AAPL",
        "country": "United States",
        "exchange": "NASDAQ",
        "industry": "Computer Manufacturing",
        "entity_id": "EQ0010169500001000",
        "global_id": "BBG000B9XRY4",
        "competitors": [
          "GOOG",
          "HPQ"
        ]
      },
      {
        "name": "Amazon.com, Inc.",
        "type": "Public",
        "index": "S&P 500, Russell 1000, Russell 3000, Wilshire 5000, NASDAQ 100",
        "region": "North America",
        "sector": "Consumer Services",
        "ticker": "AMZN",
        "country": "United States",
        "exchange": "NASDAQ",
        "industry": "Catalog/Specialty Distribution",
        "entity_id": "EQ0021695200001000",
        "global_id": "BBG000BVPV84",
        "competitors": [
          "AAPL",
          "BKS"
        ]
      }
    ],
    "event_author_rank": [
      {
        "author_rank": 4,
        "event_group": "Employment Actions"
      },
      {
        "author_rank": 4,
        "event_group": "Employment Actions"
      }
    ],
    "event_groups": [
      {
        "type": "Recruitment",
        "group": "Employment Actions"
      },
      {
        "type": "Layoff",
        "group": "Employment Actions"
      }
    ],
    "event_impact_score": {
      "overall": 40.88471673254282,
      "on_entities": [
        {
          "entity": "AAPL",
          "on_entity": 31
        },
        {
          "entity": "AMZN",
          "on_entity": 32
        }
      ]
    },
    "event_source_rank": [
      {
        "event_group": "Employment Actions",
        "source_rank": 6
      },
      {
        "event_group": "Employment Actions",
        "source_rank": 6
      }
    ],
    "event_summary": {
      "group": "",
      "theme": "",
      "topic": "",
      "action": "",
      "sub-theme": "",
      "acting_party": ""
    },
    "first_mention": false,
    "harvested_at": "2017-02-08 16:17:39 UTC",
    "overall_author_rank": 5,
    "overall_source_rank": 6,
    "source_id": null,
    "story_id": {
      "$oid": "589a6b6469fe9f7f70ac1df6"
    },
    "story_saturation": "high",
    "story_sentiment": 0.072,
    "story_shares": null,
    "story_volume": 58
  }
Attributes Type Description
id integer unique id for feed (1 or greater)
article_id.$oid string unique id per article
article_sentiment decimal determines if the article was written positively or negatively (-1.000 to 1.000)
article_type string indicates the type of source (e.g. blog, article)
article_url url string original link to article
entities list List of associated equities objects that are identified for this article
entities_name string name of the company (8,000+ U.S. public equities)
entities_type string classifies whether the company is publicly traded (e.g. Public)
entities_index string Comma-separated string of indices company is listed on
entities_region string Region of the company’s headquarters
entities_sector string Sector of the company
entities_ticker string Ticker of the company
entities_country string Country of company’s headquarters
entities_exchange string Exchange the company is traded on
entities_industry string Industry of the company
entities_entity_id string Entity level ID of the company, derived from Bloomberg Open Symbology
entities_global_id string Unique global ID of the company, derived from Bloomberg Open Symbology
entities_competitors list List of top three competitors associated with the company
event_author_rank list Each object indicates the author’s reliability in reporting on specific events
event_groups list Each object has a major event group and a subsection of that group
event_groups_type string A subsection of an event group for more detail
event_groups_group string A major event i.e. event group
event_impact_score object Calculates the article’s impact i.e. chance of affecting the associated company’s stock price
event_impact_score_overall decimal Determines chance of event affecting stock prices in general by end of trading day
event_impact_score_on_entities list Determines chance of event affecting associated company’s stock price by end of trading day
event_source_rank list Each object indicates the source’s reliability in reporting on specific events
event_summary_topic string Level 1 event category
event_summary_group string Level 2 event category
event_summary_theme string Level 3 event category
event_summary_sub_theme string Level 4 event category
event_summary_action string action of an event
event_summary_acting_party string parties associated with the event
first_mention boolean If this article is the first one to break this new story
harvested_at datetime UTC formatted time when Accern received article
overall_author_rank integer rank (1-10) of how reliable author is at releasing articles in general
overall_source_rank integer rank (1-10) of how reliable source is at releasing articles in general
story_saturation string how much exposure this story has currently. ex. high, mid, low
story_sentiment decimal positive/negative sentiment score of the story, averaging the sentiments of related articles published so far
story_volume integer number of articles associated with this story until now
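In Python, a decoded document is a plain dict. A pared-down example of pulling the tickers and per-entity impact scores out of the fields above (the document here is a trimmed copy of the sample object, not a live response):

```python
# Trimmed document using fields from the sample object above.
doc = {
    "id": 1774184,
    "article_sentiment": 0.098,
    "entities": [
        {"ticker": "AAPL", "competitors": ["GOOG", "HPQ"]},
        {"ticker": "AMZN", "competitors": ["AAPL", "BKS"]},
    ],
    "event_impact_score": {
        "overall": 40.88,
        "on_entities": [
            {"entity": "AAPL", "on_entity": 31},
            {"entity": "AMZN", "on_entity": 32},
        ],
    },
}

# Tickers mentioned in the article, and each entity's impact score.
tickers = [e["ticker"] for e in doc["entities"]]
impact = {s["entity"]: s["on_entity"] for s in doc["event_impact_score"]["on_entities"]}
```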

Data Coverage

A brief elaboration on the kinds of data we cover and where they come from, followed by a table of average daily statistics for the data processing pipeline.

Types of Data -

Accern acquires information from many types of online sources.

How we Acquire the Data -

Accern has multiple avenues for financial information. They include our own in-house web scrapers and data obtained via partnered data providers.

Quick Info on Data Pipeline -

Metric Value
Total Websites Monitored 300 million-plus
Number of Articles Processed Each Day 5 million-plus
Number of Articles Delivered Each Day 20,000-plus
Format of Data Delivered JSON, CSV
Real-time Data Delivery Method REST API, Web Portal
Processing Time Per Article Published 40 milliseconds
Trading Analytics Derived Per Article 10-plus
Archive Date Range Available 08/25/2012 to 08/19/2016 (4 years)
Number of Archive Articles 15 million-plus
Archive Delivery Method FTP, Dropbox
Data Financial Asset Mapping Tickers, Bloomberg ID
Financial Assets Coverage 8,000-plus U.S. public equities
Financial Events Coverage 1,000-plus financial events

Story Classification

THOUSANDS of sources/authors post about the SAME news stories resulting in MILLIONS of articles/day.

Accern groups together articles talking about similar information (i.e. certain equities A, B, & C are involved in some financial event D) into UNIQUE STORIES. Accern is the pioneer of this story classification process, allowing you to track how information flows online.

In Depth: The story classification model is agnostic of article sentiment; it groups only on the extracted entities and events. The semantic structure of the article is taken as input. The model identifies the important themes and checks the last 2 weeks for similar themes; if a similar theme is found, the article is grouped with that theme. A combination of machine learning models extracts entities, which are then linked back to the 8,000+ U.S. equities and 1,000+ financial events.

Accern has compiled multiple entity dictionaries for linking names mentioned in media to the correct entities.

Proprietary Entities Dictionary Accern has developed a proprietary dictionary that maps all ways financial entities are mentioned in the media to the 8,000-plus U.S. public equities.

The dictionary consists of over 150,000 company name variations and includes Bloomberg IDs, ticker symbols, etc. mapped back to these company names. How? - Looking at historical data, we identify the different ways companies have been mentioned to build and update this dictionary.
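At lookup time the mapping reduces to a normalized dictionary. The entries below are a hypothetical slice for illustration, not the proprietary 150,000-variation dictionary:

```python
# Hypothetical slice of an entity dictionary: many surface forms
# of a company name mapped to one ticker.
ENTITY_DICT = {
    "apple": "AAPL",
    "apple inc.": "AAPL",
    "aapl": "AAPL",
    "amazon": "AMZN",
    "amazon.com": "AMZN",
}

def link_entity(mention):
    """Normalize a mention and look it up; None if unknown."""
    return ENTITY_DICT.get(mention.strip().lower())
```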

Proprietary Financial Events Dictionary Accern has developed a proprietary dictionary that maps over 30,000 financial event variations mentioned in media to an aggregated list of 1,000-plus financial events. How? - We worked with financial analysts/equity researchers to figure out an initial list of important financial events. Then, starting with this list and looking at historical data, we figured out different variations of event names.

Noise Cancellation

Accern processes millions of articles per day, a significant share of which is spam, ads, coupons, etc. To tackle this, Accern has optimized its noise detection algorithms over the years. This noise cancellation mechanism takes in around 150M+ items (articles, blogs, etc.) every day and outputs around 25k articles.

Proprietary Pattern Recognition Spam Detector

Our proprietary noise cancellation mechanism identifies whether an input article qualifies as spam, i.e. contains no insightful information.

How? - A compiled list of regular-expression-based phrases is used to classify articles as spam or not. Using machine learning models and pattern recognition, Accern can automatically figure out which articles to classify as spam based on the way their titles are written.

Example: Save Up to 50% on Your Purchases with Coupons at Amazon.com!

Accern will use pattern recognition to identify future articles talking about coupon codes from Amazon and automatically remove them.
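A minimal sketch of the regex-phrase approach; the patterns below are illustrative stand-ins for the proprietary list:

```python
import re

# Hypothetical spam phrases of the kind described above.
SPAM_PATTERNS = [
    re.compile(r"save up to \d+%", re.I),
    re.compile(r"\bcoupon(s| codes?)?\b", re.I),
    re.compile(r"\bfree shipping\b", re.I),
]

def looks_like_spam(title):
    """True if any spam phrase appears in the article title."""
    return any(p.search(title) for p in SPAM_PATTERNS)
```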

Proprietary Blacklist of Spam Websites Accern has compiled a proprietary list of websites known to release spam and content irrelevant to financial market investors.

How? - Looking at the historical archive of articles posted by different news sources, we calculate the probability that a newly published article from a given source will turn out to be spam. If the probability is above a threshold, we add the source to our blacklist, which is constantly kept up to date.
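The thresholding step can be sketched as follows; the 0.8 cutoff and the source names in the test history are illustrative assumptions, as the real cutoff is not published:

```python
def build_blacklist(history, threshold=0.8):
    """Blacklist sources whose estimated spam probability is too high.

    `history` maps source -> list of booleans (True = past article was
    spam). The spam probability is the observed spam share; the 0.8
    threshold is illustrative, not Accern's actual cutoff.
    """
    blacklist = set()
    for source, labels in history.items():
        if labels and sum(labels) / len(labels) >= threshold:
            blacklist.add(source)
    return blacklist
```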

Accern Analytics

Unique Stories (story_id)

What is it? Financial news stories are published on the web and social media in many forms, e.g. articles and SEC filings. We scour millions of these articles talking about financial events and group similar ones into stories. Each story specifies two important details: the equities being talked about (currently among 8,000+ U.S. public equities) and a description of the associated financial events (currently 1,000+ financial event distinctions available).

Quick Definition: a story is an event that involves a company.

How is it created? Every day we scrape a million-plus articles of various types such as blogs, SEC filings, and news articles. These mentions go through the story classification model to identify the associated equities (from the 8,000+ U.S. public equities) and the associated financial event (1,000+ distinctions available). All mentions that share these entities are grouped into their own stories. Each story, a combination of companies and events, is given a unique id (story_id).

Examples

An example of a big, viral story regarding Google and Legal Actions consists of articles below that are all talking about similar information.

An example of a story reporting on a rumour about Apple can be seen to have been initially posted on a blog before finally showing up on other sources with bigger reach.

First Mention (first_mention)

What is it? Whenever a financial news story breaks out, many articles (mentions) get published about the same story. First Mention tells us if the article is the FIRST to break that new story.

Quick Definition: an article reporting a story that has not been mentioned on the internet for at least 2 weeks.

How is it created? The story classification model extracts a theme (a combination of event and companies) from the input article. It then searches the last 2 weeks for highly similar themes.

If it finds one, it groups this article with the existing story (theme) and sets its first_mention to FALSE. Otherwise, it creates a new story and this article's first_mention attribute is set to TRUE.

Examples

In brief, the power of knowing an article is the first to break a new story ->

date headline first_mention
08:50 AM 28 Feb Xbox launches Netflix-like service for gamers TRUE
11:35 AM 28 Feb GameStop stock price tanks after Microsoft announces new digital-gaming service FALSE

Sentiment (article_sentiment)

What is it? The sentiment score (-1 to +1) of the article, based on its title and content.

Quick Definition: determines if the article was written positively or negatively by the author/editor.

How is it created? Sentiment analysis of articles involves 3 parallel models: bag of words, n-grams, and deep learning.

Examples

Snippets of articles with a negative sentiment about a publicly traded equity - Tesco Inc

Sentiment (story_sentiment)

What is it? The average of the sentiment scores of all articles that fall into this story so far. NOTE: If a story has only one article so far, then its first_mention=TRUE and story_sentiment equals article_sentiment.

Quick Definition: aggregates article sentiments and calculates the average sentiment for each story.

How is it created? Article sentiments, calculated via the procedure explained in the Article Sentiment section above, are aggregated and averaged. As the story grows, the overall sentiment keeps changing, and a trend can be captured over time.
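The aggregation itself is a plain running average over the story's articles:

```python
def story_sentiment(article_sentiments):
    """Average the article sentiments in the story so far; with a single
    article (the first mention) it equals article_sentiment."""
    return round(sum(article_sentiments) / len(article_sentiments), 3)
```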

Examples

Looking at articles reporting on the company earnings of Baidu (BIDU), there were mixed reviews from different sources. Interestingly, the overall story sentiment settled as positive as more and more articles were posted.

Initial Negative articles -

Dispersed Positive articles -

Accern Rank (overall_source_rank)

What is it? Accern Rank identifies whether information from a source is posted promptly and whether that information will go viral (similar articles published by others). Rank 1 is lowest and Rank 10 is highest. In other words, it tells you which sources are usually among the first to publish articles on a new story, and whether they have a knack for posting on stories that become widespread.

Quick Definition:

How is it created? A graphical model takes into account historical data (past articles): how certain news appeared in the past and what the distribution of articles within a story looked like. It checks which sources historically posted faster than other sources that then posted contextually similar articles.

Examples


Accern Rank (event_source_rank)

What is it? These ranks are based on the same Accern Rank model, which tries to predict promptness and the ability to post stories that get republished. Rank 1 is lowest and Rank 10 is highest. event_source_rank/event_author_rank is more precise: for example, Tumblr posts rumors faster than others, while Bloomberg posts financial documents faster than others. Note that a source will have different ranks for different events.

Quick Definition:

How is it created? The ranking model is the same. Ranks are calculated by filtering based on financial events.

Examples

Saturation (story_saturation)

What is it? The online exposure of the story, i.e. have a lot of people seen this story/information already? This is one of the useful metrics made possible by story classification.

Quick Definition: gauge the current potential exposure of a story

How is it created? Based on web traffic information (provided by Alexa Rank) of related articles and previous historical data, we predict the exposure of the story into different levels - low, mid, & high.

3-Step Process for Computing Saturation

Examples

Impact (_overall)

What is it? When a certain type of story appears in the media, we calculate the probability that the stock price of the company moves up or down by more than 1% by the end of the trading day. The overall impact score reflects how an event like Company Earnings generally has higher impact than other events. It is the average of the entity impact scores across the companies involved.

Quick Definition (event_impact_score_overall): determines if an event has a chance of affecting stock prices of companies in general by more than 1% at the end of the trading day.

How is it created? This is a retrospective metric. We look back in the historical archive to evaluate how the market behaved for past similar events. In brief, by overlaying 3+ years of financial events data with stock price market data, we determine whether the event has a chance of moving the stock prices of companies in general by more than 1% by the end of the day.
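A simplified reading of that retrospective computation; the >1% end-of-day rule follows the description above, while the function and its inputs are illustrative:

```python
def impact_score(moves):
    """Share of historical occurrences of an event where the stock moved
    more than 1% by end of day, expressed on a 0-100 scale.

    `moves` are the EOD percentage changes observed after past
    instances of the event; up and down moves both count.
    """
    if not moves:
        return 0.0
    hits = sum(1 for m in moves if abs(m) > 1.0)
    return 100.0 * hits / len(moves)
```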

Examples

Impact (_on_entities)

What is it? The entity impact score is more precise: it returns the probability that a particular event will affect SPECIFIC equities. For example, an event_impact_score_on_entities of 90 for Apple and a 'Mergers & Acquisitions' event conveys that this sort of story/theme has moved the market before and there is a high likelihood it will now as well. The same event can also vary in impact score across companies.

Quick Definition (event_impact_score_on_entities): determines if an event has a chance of affecting the stock price of the mentioned company by more than 1% at the end of the trading day.

How is it created? Using the same procedure as for the overall impact score, this process filters by each entity and calculates the respective probabilities.

Examples

Sample Snippet

{
    "....": "...",
    "event_groups": [
      {
        "type": "Financial Results",
        "group": "Company Earnings"
      }
    ],
    "event_impact_score": {
      "overall": 48.55540720961282,
      "on_entities": [
        {
          "entity": "EBAY",
          "on_entity": 26
        },
        {
          "entity": "AMZN",
          "on_entity": 36
        }
      ]
    }
}