Usage and Request Parameters

The simplest way to use Full-Text RSS is to use the form provided.

In the URL field, enter the URL of a partial feed or web page and click ‘Create Feed’. The resulting page should show you a newly generated feed with the full content. To use this feed in your application, copy the URL in the address bar. You can now use this new URL in place of the original partial feed URL.

If you're a developer and need to integrate Full-Text RSS in your application, we have a simple code example to give you an idea of how it can be used.

Form Options

In addition to the URL, you can also specify a number of other options in the form:

Max Items

Set the maximum number of feed items we should process. The smaller the number, the faster the new feed is produced.

If your URL refers to a standard web page, this will have no effect: you will only get 1 item.

Link Handling

By default, links within the content are preserved. Change this field if you'd like links removed, or included as footnotes.

Extraction Failure Handling

If the extraction pattern above fails to match, FTR can remove the item from the feed or keep it in.

Keeping the item will keep the title, URL and original description (if any) found in the feed. In addition, FTR inserts a message before the original description notifying you that extraction failed.

Include Excerpt

Check the box and we'll include a brief plain text excerpt from the extracted content in the output.

JSON Output

We'll output JSON if selected (useful if your application already parses JSON and you want to avoid importing an RSS parsing library).
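If you do choose JSON, Python's standard json module is enough to read the result. The sketch below assumes the JSON mirrors the RSS structure (rss, then channel, then item) and that each item carries title and link keys. That layout is an assumption, not something this guide specifies, so check it against the output of your own installation. The sample string is invented for illustration.

```python
import json

# Invented sample standing in for a Full-Text RSS JSON response.
# The rss > channel > item layout is an assumption; verify it locally.
SAMPLE = """
{"rss": {"channel": {"title": "Example feed",
  "item": [
    {"title": "First post", "link": "http://www.example.org/1"},
    {"title": "Second post", "link": "http://www.example.org/2"}
  ]}}}
"""

def list_items(json_text):
    """Return (title, link) pairs from the assumed JSON layout."""
    data = json.loads(json_text)
    items = data.get("rss", {}).get("channel", {}).get("item", [])
    # A single item may be serialised as an object rather than a list.
    if isinstance(items, dict):
        items = [items]
    return [(item.get("title"), item.get("link")) for item in items]

for title, link in list_items(SAMPLE):
    print(title, link)
```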

Debug

Check the box to see what's happening behind the scenes.

Query String Parameters

Using the form is the simplest way to create a Full-Text RSS URL, but you can also construct one yourself. The form fields above are turned into query string parameters when you submit the form. Let's look at those parameters here, and a few more that are not presented on the form.

These parameters are to be appended on to the base URL. The base URL is where you installed Full-Text RSS, e.g. http://example.org/full-text-rss/makefulltextfeed.php. Because this will differ from installation to installation, in this guide we'll simply use makefulltextfeed.php in examples.

These parameters can be combined in the URL.

A note on encoding: if you're constructing URLs without using the form, make sure you URL-encode the parameter values (anything after the '=' and before the '&'). In PHP the function to use is urlencode(). If you're doing it by hand, you can paste the parameter values into the form field at http://meyerweb.com/eric/tools/dencoder/ and click 'Encode' to get the encoded value.
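If your application is written in Python rather than PHP, urllib.parse.urlencode() from the standard library does the same job. The following is a minimal sketch: the helper name build_request_url is made up for this example, and the base URL is the placeholder used throughout this guide, so substitute the address of your own installation.

```python
from urllib.parse import urlencode

# Placeholder base URL, as used throughout this guide.
BASE = "makefulltextfeed.php"

def build_request_url(source_url, **options):
    """Append URL-encoded query string parameters to the base URL."""
    params = {"url": source_url}
    params.update(options)
    # urlencode() percent-encodes each value, so '/' becomes %2F.
    return BASE + "?" + urlencode(params)

print(build_request_url("www.example.org/feed"))
# makefulltextfeed.php?url=www.example.org%2Ffeed
print(build_request_url("www.example.org/feed", format="json"))
# makefulltextfeed.php?url=www.example.org%2Ffeed&format=json
```

Letting a library do the encoding avoids mistakes with characters such as '&' or '?' inside the source URL, which would otherwise be read as part of the Full-Text RSS query string.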

Parameter Value Description
url string (URL)

This is the only required parameter. It should be the URL to a partial feed or a standard HTML page. You can omit the ‘http://’ prefix if you like.

Example: convert www.example.org/feed to a full-text feed
makefulltextfeed.php?url=www.example.org%2Ffeed

Note: %2F is the encoded value for '/'

format rss (default), json

Full-Text RSS outputs RSS by default. The only other valid output format is JSON. To get JSON output, pass format=json in the query string. Omit the parameter (or set it to ‘rss’) if you’d like RSS.

Example: return results as JSON
makefulltextfeed.php?url=www.example.org%2Ffeed&format=json

summary 0 (default), 1

If set to 1, an excerpt will be included for each item in the output.

Example: return excerpt and full content
makefulltextfeed.php?url=www.example.org%2Ffeed&summary=1

content 0, 1 (default)

If set to 0, the extracted content will not be included in the output.

Example: don't return full content
makefulltextfeed.php?url=www.example.org%2Ffeed&content=0
Example: return only excerpts
makefulltextfeed.php?url=www.example.org%2Ffeed&content=0&summary=1

links preserve (default), footnotes, remove

Links can either be preserved, made into footnotes, or removed. None of these options affect the link text, only the hyperlink itself.

Example
makefulltextfeed.php?url=www.example.org%2Ffeed&links=remove

exc 0 (default), 1

If Full-Text RSS fails to extract the article body, the generated feed item will include a message saying extraction failed, followed by the original item description (if present in the original feed). You can ask Full-Text RSS to remove such items from the generated feed completely by passing 1 in this parameter.

Example
makefulltextfeed.php?url=www.example.org%2Ffeed&exc=1

html 0 (default), 1

Treat the input source as HTML (parse-as-html-first mode). To enable, pass html=1 in the query string. If enabled, Full-Text RSS will not attempt to parse the response as a feed. This increases performance slightly and should be used if you know that the URL is not a feed.

Note: If excluded, or set to 0, Full-Text RSS first tries to parse the server’s response as a feed, and only if it fails to parse as a feed will it revert to HTML parsing. In the default parse-as-feed-first mode, Full-Text RSS will identify itself as PHP first and only if a valid feed is returned will it identify itself as a browser in subsequent requests to fetch the feed items. In parse-as-html-first mode, Full-Text RSS will identify itself as a browser from the very first request.

Example
makefulltextfeed.php?url=www.example.org%2Farticle.html&html=1
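The fallback described in the note can be pictured roughly as follows. This is only an illustrative sketch of the parse-as-feed-first idea, not Full-Text RSS's actual code: the function name choose_mode and the feed check (looking for an rss, feed or RDF root element) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

def choose_mode(body, force_html=False):
    """Return 'feed' or 'html' for a response body (illustrative only)."""
    if force_html:
        # Equivalent in spirit to html=1: skip the feed attempt.
        return "html"
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Not well-formed XML, so it cannot be a feed.
        return "html"
    # Strip any namespace, e.g. '{http://www.w3.org/2005/Atom}feed'.
    name = root.tag.rsplit("}", 1)[-1].lower()
    return "feed" if name in ("rss", "feed", "rdf") else "html"

print(choose_mode("<rss version='2.0'><channel/></rss>"))                   # feed
print(choose_mode("<html><body><p>Hi<br></body></html>"))                   # html
print(choose_mode("<rss version='2.0'><channel/></rss>", force_html=True))  # html
```

Passing html=1 corresponds to the force_html branch: the feed attempt is skipped entirely, which is where the small performance gain comes from.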
xss 0 (default), 1

Use this to enable XSS filtering. We have not enabled this by default because we assume the majority of our users do not display the HTML retrieved by Full-Text RSS in a web page without further processing. If you subscribe to our generated feeds in your news reader application, it should, if it’s good software, already filter the resulting HTML for XSS attacks, making it redundant for Full-Text RSS to do the same. The same goes for frameworks and CMSs which display feed content: the content should be treated like any other user-submitted content.

If you are writing an application yourself which is processing feeds generated by Full-Text RSS, you can either filter the HTML yourself to remove potential XSS attacks or enable this option. This might be useful if you are processing our generated feeds with JavaScript on the client side, although client-side XSS filtering is available too.

If enabled, we’ll pass retrieved HTML content through htmLawed (safe flag on and style attributes denied). Note: if enabled this will also remove certain elements you may want to preserve, such as iframes.

Example: enable xss filtering
makefulltextfeed.php?url=www.example.org%2Farticle.html&xss

callback string

This is for JSONP use. If you’re requesting JSON output, you can also specify a callback function (a client-side JavaScript function) to receive the Full-Text RSS JSON output.

lang 0, 1 (default), 2, 3

Language detection. If you’d like Full-Text RSS to find the language of the articles it processes, you can use one of the following values:

0: Ignore language.
1: Use article metadata (e.g. HTML lang attribute) or feed metadata. (Default value)
2: As above, but guess the language if it’s not specified.
3: Always guess the language, whether it’s specified or not.

If language detection is enabled and a match is found, the language code will be returned in the <dc:language> element inside the <item> element.
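To read that element from the RSS output, your parser needs to know the namespace behind the dc prefix. The sketch below uses Python's standard library and assumes the prefix is bound to the usual Dublin Core namespace; the sample feed is invented for illustration, so confirm the namespace declaration in the feed your installation actually produces.

```python
import xml.etree.ElementTree as ET

# Dublin Core namespace, conventionally bound to the 'dc' prefix.
NS = {"dc": "http://purl.org/dc/elements/1.1/"}

# Invented sample standing in for a generated feed.
SAMPLE = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <item><title>Hello</title><dc:language>en</dc:language></item>
    <item><title>No language here</title></item>
  </channel>
</rss>"""

def item_languages(feed_xml):
    """Map each item title to its dc:language value, or None if absent."""
    root = ET.fromstring(feed_xml)
    result = {}
    for item in root.iter("item"):
        title = item.findtext("title")
        result[title] = item.findtext("dc:language", default=None, namespaces=NS)
    return result

print(item_languages(SAMPLE))
```

Items without a detected language simply lack the element, so the lookup returns None for them rather than raising an error.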

debug [no value], rawhtml, parsedhtml

If this parameter is present, Full-Text RSS will output the steps it is taking behind the scenes to help you debug problems.

If the parameter value is rawhtml, Full-Text RSS will output the HTTP response (headers and body) of the first response after redirects.

If the parameter value is parsedhtml, Full-Text RSS will output the reconstructed HTML (after its own parsing). This version is what the extraction rules are applied to, and it may differ from the original (rawhtml) output. If your extraction rules are not picking out any elements, this will likely help identify the problem.

Note: Full-Text RSS will stop execution after HTML output if one of the last two parameter values is passed. Otherwise it will continue showing debug output until the end.

parser html5php, libxml

The default parser is libxml as it’s the fastest. HTML5-PHP is an HTML5 parser implemented in PHP. It’s slower than libxml, but can often produce better results. You can request HTML5-PHP be used as the parser in a site-specific config file (to ensure it gets used for all URLs for that site), or explicitly via this request parameter.

proxy 0, 1, string (proxy name)

This parameter has no effect if proxy servers have not been entered in the config file. If they have been entered and enabled, you can pass the following values: 0 to disable proxy use (uses a direct connection), 1 for default proxy behaviour (whatever is set in the config), or a string to identify a specific proxy server (it has to match the name given to the proxy in the config file).
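Because several parameters accept only a fixed set of values, it can help to check them before building a combined URL. The following Python sketch does that for the parameters in the table above whose values are enumerated. The helper name build_checked_url is hypothetical, and free-form or valueless parameters (callback, proxy, debug) are deliberately left out.

```python
from urllib.parse import urlencode

# Allowed values as listed in the parameter table above.
ALLOWED = {
    "format": {"rss", "json"},
    "summary": {"0", "1"},
    "content": {"0", "1"},
    "links": {"preserve", "footnotes", "remove"},
    "exc": {"0", "1"},
    "html": {"0", "1"},
    "xss": {"0", "1"},
    "lang": {"0", "1", "2", "3"},
    "parser": {"html5php", "libxml"},
}

def build_checked_url(source_url, **options):
    """Build a request URL, rejecting unknown names or values."""
    for name, value in options.items():
        if name not in ALLOWED:
            raise ValueError(f"unknown parameter: {name}")
        if str(value) not in ALLOWED[name]:
            raise ValueError(f"invalid value for {name}: {value!r}")
    params = {"url": source_url, **{k: str(v) for k, v in options.items()}}
    return "makefulltextfeed.php?" + urlencode(params)

print(build_checked_url("www.example.org/feed", content=0, summary=1, links="remove"))
# makefulltextfeed.php?url=www.example.org%2Ffeed&content=0&summary=1&links=remove
```

Rejecting a typo such as links=strip locally is quicker than working out afterwards why the generated feed still contains links.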

Feed-only parameters

These parameters only apply to web feeds. They have no effect when the input URL points to a web page.

Parameter Value Description
use_extracted_title [no value]

By default, if the input URL points to a feed, item titles in the generated feed will not be changed, as we assume item titles in feeds are not truncated. If you’d like them to be replaced with titles Full-Text RSS extracts, use this parameter in the request (the value does not matter). To enable or disable this for all feeds, see the config file, specifically $options->favour_feed_titles.

Example
makefulltextfeed.php?url=www.example.org%2Ffeed&use_extracted_title
max number

The maximum number of feed items to process. See the section on Max Items in the form options above. (The default and upper limit are set in the configuration file.)

Example: process first two items from feed
makefulltextfeed.php?url=www.example.org%2Ffeed&max=2
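Parameters that take no value, such as use_extracted_title, do not fit the usual key=value encoding helpers. One way to handle them in Python is sketched below; the helper name build_feed_url and its flags argument are inventions for this example, not part of Full-Text RSS.

```python
from urllib.parse import urlencode, quote

def build_feed_url(source_url, flags=(), **options):
    """Build a request URL with key=value pairs plus bare flag names."""
    query = urlencode({"url": source_url, **options})
    # Flags such as use_extracted_title carry no value, so they are
    # appended as bare names rather than passed through urlencode().
    for flag in flags:
        query += "&" + quote(flag, safe="")
    return "makefulltextfeed.php?" + query

print(build_feed_url("www.example.org/feed", flags=["use_extracted_title"], max=2))
# makefulltextfeed.php?url=www.example.org%2Ffeed&max=2&use_extracted_title
```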
