Thanks. I'll raise the issue with our webmaster. We're using Avada.
I only asked because it did seem possible, in WPFTS, to alter the format of the items displayed under the Title in the search results, but not the Title itself.
Best posts made by Nick
-
RE: Altering the way Search Results are displayed.
-
Excerpt Text Peculiarities
I have a problem with the excerpt text WPFTS is displaying in the search results. It seems to be selecting some, but not all, of the text surrounding the search term, almost as though some text in the paragraph did not belong there. By way of example, here is some text that WPFTS has indexed correctly from one of our documents:
"Tuborg Brewery with red and green straw hats, so familiar a sight on the streets of Copenhagen. JEOFFRY SPENCETHE BRISTOL AND SOUTH WALES UNION RAILWAY, John Norris, 32 pp, 5 photo illus, 2 maps, soft covers. RCHS 1985, ISBN 0-901461-38-5 £2.40 + p&p.
The rail journey between Bristol and South Wales was shortened by the Severn Bridge in 1879 and again by the Severn Tunnel in 1886, but an earlier scheme to avoid the detour via Gloucester utilised a combination of ferry and rail travel. For that purpose the Bristol and South Wales Union Railway Company was incorporated in 1857. An existing ferry had to be improved and various difficulties overcome before the new link could be formally opened on 1 January 1864."
When I searched for "ISBN 0-901461-38-5" the excerpt was "RCHS 1985, ISBN 0-901461-38–5 £2.40 + p&p.".
When I searched for "incorporated in 1857" the excerpt was "For that purpose the Bristol and South Wales Union Railway Company was incorporated in 1857."
When I searched for "via Gloucester" the excerpt was "The rail journey between Bristol and South Wales was shortened by the Severn Bridge in 1879 and again by the Severn Tunnel in 1886, but an earlier scheme to avoid the detour via Gloucester utilised a combination of ferry and rail travel."
The text in these three examples was continuous from the index, but much shorter than the 500 characters I had specified in the WPFTS settings.
However, when I searched for "BRISTOL AND SOUTH WALES UNION RAILWAY", the excerpt was "JEOFFRY SPENCE THE BRISTOL AND SOUTH WALES UNION RAILWAY, John Norris, 32 pp, 5 photo illus, 2 maps, soft covers. For that purpose the Bristol and South Wales Union Railway Company was incorporated in 1857." So here there is a whole sentence and more missing out of the middle of the excerpt.
Could you investigate, please?
-
RE: PDF Search Results: Titles and Excerpts
Many thanks for the advice. We've implemented it, and our search results now show in a single column across the page, which is exactly what we wanted.
-
Search Results - BOOLEAN Operators and Relevance
Our implementation of WPFTS is configured with the Default Search Logic set to "AND".
I did a search of our website for "2015 Committee", both with and without the quotation marks (the results were the same), and the second and third most relevant results did indeed contain the phrase "2015 Committee"; the second twice and the third once.
However, the top result did not. It was a Bibliography for the year 2015, and contained "Committee" nine times, and "2015" 962 times, but in completely different parts of the document.
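The difference between the two kinds of match described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not WPFTS's actual code: the function names and the two sample texts are hypothetical. It shows that a document can satisfy "AND" (both words present somewhere) without containing the words side by side as a phrase.

```python
import re

def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def matches_and(query, document):
    """AND logic: every query word occurs somewhere in the document."""
    words = set(tokenize(document))
    return all(term in words for term in tokenize(query))

def matches_phrase(query, document):
    """Phrase logic: the query words occur next to each other, in order."""
    terms = tokenize(query)
    tokens = tokenize(document)
    n = len(terms)
    return any(tokens[i:i + n] == terms for i in range(len(tokens) - n + 1))

# Hypothetical sample texts standing in for the two kinds of result.
bibliography = "Bibliography 2015. Books published in 2015. The committee met twice."
minutes = "Minutes of the 2015 Committee meeting."

print(matches_and("2015 Committee", bibliography))     # True
print(matches_phrase("2015 Committee", bibliography))  # False
print(matches_phrase("2015 Committee", minutes))       # True
```

Under this reading, a bibliography that mentions "2015" and "Committee" in different places is a valid "AND" match, and its rank then depends on how relevance is scored rather than on the operator.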
This feels more like an "OR" result than an "AND" one, but maybe I misunderstand how the "AND" and "OR" operators work?
-
RE: PDF Search Results: Titles and Excerpts
@EpsilonAdmin said in PDF Search Results: Titles and Excerpts:
#main .post h2.fusion-post-title a {
    font-size: 21px;
}
#main .post h2.fusion-post-title {
    margin-bottom: 5px;
}
Thank you for your further advice. I added your code to the end of the code in the Custom CSS Styling dialog, but decided that a font-size of 20px, and a margin-bottom of 0px, worked best for our website. The results can be seen by inserting some text (try "Worcester") in the search box that is top-right here: https://rchs.org.uk/
-
Searching data attached to image files
Having loaded our 1,800 pdf documents (magazines etc) on to our website, and successfully implemented WPFTS to search within them, we are now turning our attention to the 40,000 historic photographs in our archive. Around half have been digitised, mainly as jpegs but with some tifs, and we are working on the remainder.
Each photo will, in due course, be annotated with a description and I would like to know if other users have experience of using WPFTS to search this type of information. Currently, the digital images are stored on hard drives, and the information is on spreadsheets, but we are researching software to combine the two, to enable the images to be published on our website, and for the scanning and researching process to be suitable for collaborative teamworking.
Latest posts made by Nick
-
RE: Excerpt Text Peculiarities
Your explanation helps a lot, and I think adding a summary to your documentation would help other people too.
Originally, I had assumed that each occurrence of the search term in the document would produce a separate result with a 500-character excerpt "wrapped around" the search term. However, if I understand correctly, each document containing the search term returns only one result, and the excerpt might include a number of sentences from different parts of the document, these generally being the shortest sentences found (as you've described), up to the 500-character limit?
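To check that understanding, here is a minimal Python sketch of a sentence-based excerpt. It is an assumption about the general approach, not WPFTS's real algorithm: the function names, the naive sentence splitter and the shortened sample text are all hypothetical. It keeps only the sentences that contain the search term, prefers the shortest ones, and stops at the limit, which is why a sentence in the middle can drop out.

```python
import re

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def sentence_excerpt(text, term, limit=500):
    """Join the sentences containing `term`, shortest first, within `limit` characters."""
    sentences = split_sentences(text)
    hits = [(i, s) for i, s in enumerate(sentences) if term.lower() in s.lower()]
    chosen, used = [], 0
    for i, s in sorted(hits, key=lambda pair: len(pair[1])):
        cost = len(s) + (1 if chosen else 0)  # +1 for the joining space
        if used + cost > limit:
            break
        chosen.append((i, s))
        used += cost
    # Put the chosen sentences back into document order before joining.
    return " ".join(s for _, s in sorted(chosen))

# Shortened, hypothetical stand-in for the indexed text.
text = ("THE UNION RAILWAY, John Norris, 32 pp. RCHS 1985, soft covers. "
        "The Union Railway Company was incorporated in 1857.")

print(sentence_excerpt(text, "union railway"))
```

In this sketch the middle sentence ("RCHS 1985, soft covers.") is skipped because it does not contain the search term, so the excerpt is not a continuous copy of the original.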
When these excerpt sentences are not continuous in the document, perhaps they could be numbered and placed in new paragraphs, to make clear that they are not a continuous section of text from the document?
Your idea of adding text at the end of the excerpt to signify when there are further "good" sentences would also help. Maybe "there are X further appearances in other parts of this document" (you could leave the number X out if the software can't provide it).
-
RE: Excerpt Text Peculiarities
@Nick said in Excerpt Text Peculiarities:
BRISTOL AND SOUTH WALES UNION RAILWAY
To me, the main problem is that the excerpt text is not a continuous copy of the text in the original document, because it has thrown out a sentence in the middle of the relevant paragraph. For someone reading the excerpt, this missing sentence might be vital in order to understand the context of the search term within the document.
I'm sure there will always be situations where any algorithm creates anomalies, but my current view is that the excerpt should always be a continuous copy of the original.
Where to start and end the excerpt is more tricky, but paragraph breaks might be good indicators, better still a double paragraph break (i.e. a blank line in the text). In the example above, the text above the blank line (containing "Tuborg") belongs to a completely different topic, and is irrelevant to the search term.
It might also help if the specified character limit was more fully used. We have ours set to 500 (I assume this is characters), but in some cases we are getting excerpts of well under 100 characters.
I'm assuming here that WPFTS only returns a single result for a document containing the search term, even though the search term might appear several times in various parts of the document? How does it decide which excerpt to display, and would it be possible to add a flag in the search results to state something like "Search term appears a further x times in the document"?
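The suggestion above could be sketched roughly as follows in Python. This is only an illustration of the proposal, not anything WPFTS provides: the function name, the sample text and the rule "a blank line ends a paragraph" are assumptions. The excerpt is always a continuous slice of the paragraph holding the first hit, and the number of further hits is returned alongside it.

```python
def continuous_excerpt(text, term, limit=500):
    """Return (excerpt, further_hits) for the first occurrence of `term`.

    The excerpt is a continuous slice of the original text, bounded by the
    blank lines around the paragraph that holds the first hit.
    Assumes lower-casing does not change the length of the text.
    """
    lowered, needle = text.lower(), term.lower()
    first = lowered.find(needle)
    if first == -1:
        return "", 0
    # Paragraph boundaries: the nearest blank lines before and after the hit.
    start = text.rfind("\n\n", 0, first)
    start = 0 if start == -1 else start + 2
    end = text.find("\n\n", first)
    end = len(text) if end == -1 else end
    if end - start > limit:
        # Paragraph too long: take a window around the hit, clamped to the paragraph.
        half = max((limit - len(needle)) // 2, 0)
        win_start = max(start, first - half)
        win_end = min(end, win_start + limit)
        start, end = max(start, win_end - limit), win_end
    further_hits = lowered.count(needle) - 1
    return text[start:end].strip(), further_hits

# Hypothetical sample: three paragraphs separated by blank lines.
text = ("Tuborg Brewery with straw hats.\n\n"
        "The railway was incorporated in 1857. The railway opened in 1864.\n\n"
        "Another railway note.")

excerpt, further = continuous_excerpt(text, "railway")
print(excerpt)
print(f"Search term appears a further {further} times in the document")
```

Here the unrelated "Tuborg" paragraph is left out because it sits above the blank line, and nothing is dropped from the middle of the excerpt.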
Nick
-
RE: "Prevent Direct Access" Plugin
Thank you so much. We'll try this, and I'll let you know the results.
-
RE: "Prevent Direct Access" Plugin
Good morning, we are using the Members Plugin (by Memberpress) version 3.1.3.
I'm very grateful that you are taking an interest in this topic. One of the, perhaps, unintended consequences of WPFTS is, obviously, that it makes it a lot easier to find documents that were previously difficult to find because they were hiding in full sight amongst, in our case, hundreds of other documents. As I've said before, often it is just inconvenient and might lead to a minor loss of new members, but in other cases it could have very adverse effects on their businesses. I think this is one of the topics that is worth discussion, even though the solution will sometimes lie with a different plugin.
By the way, we did try having the "attachment" page visible for a while, but it displayed an enormous image of the document thumbnail, with a much smaller copy that contained the link below it. The link was, therefore, effectively hidden by the large image. Hence, using it as we have discussed but not being able to see it (at least, not for more than a second or two) would be very good.
Nick
-
RE: "Prevent Direct Access" Plugin
I know that some users will want to make accessing media files impossible, which I believe involves a plugin like "Prevent Direct Access".
On our website, our pdfs are copies of our magazines, and we would be content with just making it difficult for non-members to access them.
Is it possible to use the internal "attachment" page of the file to do this?
We could make this page members only, but could we also set up an automatic redirection command from this page to the URL of the document? Hence, a non-member would not see this page but would be taken to the "Join Us" page (instead of a 404 page). Members would not see it either, because they would be automatically redirected to the document.
Might this work?
-
"Prevent Direct Access" Plugin
Has anyone had experience of using this plugin with WPFTS? We thought it might be the answer to preventing non-members of our Society from seeing protected pdf files - i.e. the ones that our members pay subscriptions in order to see.
We installed the free version as a trial but, in addition to a number of other shortcomings, we discovered that non-members could find references/excerpts to the protected files in the search results using WPFTS search (which is good), but that clicking on the search result opened the protected file (which is bad). There is an option to buy a "Gold" upgrade to the plugin, but I'd like some confidence it will work before paying, and a query to the developer hasn't yet been responded to.
I did exchange some emails on protecting files with Epsilon a couple of months ago, but I thought this plugin might offer a straightforward way to protect our pdfs.
-
RE: WPFTS Not Recognising Columns in a PDF Document
@EpsilonAdmin I've sent the requested information by email.