OCR

richmond62
Posts: 3048
Joined: Sun Sep 12, 2021 11:03 am
Location: Bulgaria

OCR

Post by richmond62 »

https://richmondmathewson.owlstown.net/
tperry2x
Posts: 1783
Joined: Tue Dec 21, 2021 9:10 pm
Location: Britain (Previously known as Great Britain)

Re: OCR

Post by tperry2x »

That link does still go to a page that exists (amazingly), but the only download is a file called "rev", with no file extension.
It doesn't seem to be a valid stack file:
[Attachment: Screenshot at 2024-03-18 09-40-07.png]

Re: OCR

Post by richmond62 »

--- BASHING REMOVED ----

Where have all those example stacks gone?

Re: OCR

Post by tperry2x »

In fact, the content of that file is XHTML:

Code: Select all

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head>


        <meta name="description" content="Gateway to resources for new and experienced LiveCode developers." />
        <meta name="keywords" content="" />
        <meta name="viewport" content="width=device-width, minimum-scale=1.0, maximum-scale=2.0" />
        <link rel="shortcut icon" type="image/x-icon" href="https://livecode.com/wp-content/themes/livecode2013/ims/livecode_favicon.png">
        <link rel="stylesheet" href="https://livecode.com/wp-content/themes/livecode2013/css/normalize.css">
<link type="text/css" href="https://livecodeshare.runrev.com/styles.css" rel="stylesheet" media="screen" />
<link type="text/css" href="https://livecodeshare.runrev.com/foldbar.css" rel="stylesheet" media="screen" />
<link type="text/css" href="https://livecodeshare.runrev.com/code.css" rel="stylesheet" media="screen" />
<link type="text/css" rel="stylesheet" href="https://livecodeshare.runrev.com/comments.css" />
<link type="text/css" rel="stylesheet" href="https://livecodeshare.runrev.com/runrev-custom.css" />
<link type="text/css" rel="stylesheet" href="https://livecodeshare.runrev.com/css/companynav.css" />
<link type="text/css" rel="stylesheet" href="https://livecodeshare.runrev.com/css/runrev.css" />
<link type="text/css" rel="stylesheet" href="https://livecodeshare.runrev.com/css/companynav-custom.css" />

        <link rel="stylesheet" href="https://livecode.com/wp-content/themes/livecode2013/style.css">
        <link rel='stylesheet' id='admin-bar-css'  href='https://livecode.com/wp-includes/css/admin-bar.min.css?ver=3.5.1' type='text/css' media='all' />
<link rel='stylesheet' id='q-a-plus-css'  href='https://livecode.com/wp-content/plugins/q-and-a/css/q-a-plus.css?ver=1.0.6.2' type='text/css' media='screen' />
<link rel='stylesheet' id='hubspot-css'  href='https://livecode.com/wp-content/plugins/hubspot/css/hubspot.css?ver=3.5.1' type='text/css' media='all' />
<link rel='stylesheet' id='core3.0-css'  href='https://livecode.com/wp-content/plugins/wp-syntaxhighlighter/syntaxhighlighter3/styles/shCore.css?ver=3.0' type='text/css' media='all' />
<link rel='stylesheet' id='core-Default3.0-css'  href='https://livecode.com/wp-content/plugins/wp-syntaxhighlighter/syntaxhighlighter3/styles/shCoreDefault.css?ver=3.0' type='text/css' media='all' />
<link rel='stylesheet' id='theme-Default3.0-css'  href='https://livecode.com/wp-content/plugins/wp-syntaxhighlighter/syntaxhighlighter3/styles/shThemeDefault.css?ver=3.0' type='text/css' media='all' />



<script type="text/javascript">
var s_url = 'https://livecodeshare.runrev.com';
</script>
<script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/jquery/1.3.2/jquery.min.js"></script>
<script type="text/javascript" src='https://livecodeshare.runrev.com/include/revonline.js'></script>
<title>revOnline | RunRev</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>

<body style="background-color: white;">




<div id="struct-contents">
	<div id="struct-holder">
        <div id="struct-page">
        	<div id="struct-page-banner" style="background:URL('https://livecodeshare.runrev.com/images/banner-revonline-home.png') left top no-repeat; height:140px">
        	<div id="search-form-wrapper">
          	<form id="search-form" method="post" action='https://livecodeshare.runrev.com/search/'>
             <div id="search-form-right"><input id="search-button" type="image" name="Search" value="Search" src='https://livecodeshare.runrev.com/images/revonline-search.gif' /></div>
          	 <div id="seach-form-left"><input id="search-term" type="text" name="term" value="Enter Search Term" onclick="javascript: searchSetText('');" onblur="javascript: searchResetText();" /></div>
          	 </form>
        	</div>
          </div>
            <div id="struct-page-middle-holder">
            	<div id="struct-page-middle-content" >
      </div>
   </div>

 	</div>
 </div>
<div id="image-menu-bg"></div>
<div id="image-menu-top"></div>
<div id="image-menu-bottom"></div>

</body>
</html>
... and produces this:
[Attachment: Screenshot at 2024-03-18 10-46-59.png]

Re: OCR

Post by richmond62 »

Well, that's useful, isn't it . . . . . . . . :(

I tried the Wayback Machine to no avail.
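
For anyone who wants to repeat that check without clicking through the calendar view, the Internet Archive has a small JSON "availability" endpoint. Below is a minimal Python sketch. The helper names are invented for illustration, and the page URL passed in is a placeholder, because the original stack link is not reproduced in this thread. An empty result only means no snapshot was found for that exact URL.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://archive.org/wayback/available"


def availability_url(page_url: str, timestamp: str = "") -> str:
    """Build the query URL; timestamp is an optional YYYYMMDD-style hint."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return API + "?" + urlencode(params)


def closest_snapshot(response_text: str):
    """Return the closest snapshot URL from the JSON reply, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(page_url: str, timestamp: str = ""):
    """Ask the Wayback Machine for the closest snapshot (needs network access)."""
    with urlopen(availability_url(page_url, timestamp), timeout=30) as resp:
        return closest_snapshot(resp.read().decode("utf-8"))
```

Usage would be something like `lookup("example.com/some/stack/page")`, which returns either a web.archive.org URL or `None`.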
OpenXTalkPaul
Posts: 1815
Joined: Sat Sep 11, 2021 4:19 pm

Re: OCR

Post by OpenXTalkPaul »

Edit: Oh, that was one of the LCS + JS demos from Hermann Hoch. He took a few of his stacks off that site out of frustration at one point, and sadly they aren't in any Wayback Machine snapshot either.
The library is included in the stack, so you can use it offline.
The License is GNU GENERAL PUBLIC v3.0
It sounds like this was a stack that used the JavaScript (Emscripten) port found here:
https://github.com/antimatter15/ocrad.js
pasted into HTML and used via the Browser Widget.

There's a library version of the original Ocrad, as seen here: https://packages.debian.org/bullseye/libocrad-dev
so that could be wrapped with Extension Builder binding strings to use the library more directly from our scripts.

If you have newer macOS Ventura and above, instant Optical Character Recognition is built into the OS now.

Re: OCR

Post by OpenXTalkPaul »

This might be a good Extension Builder FFI starter project for someone to try, because
there do not appear to be all that many functions to wrap in that library:

OCRAD.set_print
OCRAD.write_file
OCRAD.read_file
OCRAD.read_text
OCRAD.delete_file
OCRAD.version
OCRAD.open
OCRAD.close
OCRAD.get_errno
OCRAD.set_image
OCRAD.set_image_from_file
OCRAD.set_exportfile
OCRAD.add_filter
OCRAD.set_utf8_format
OCRAD.set_threshold
OCRAD.scale
OCRAD.transform
OCRAD.recognize
OCRAD.result_blocks
OCRAD.result_lines
OCRAD.result_chars_total
OCRAD.result_chars_block
OCRAD.result_chars_line
OCRAD.result_line
OCRAD.result_first_character
OCRAD._simple

You would probably mainly need to wrap the "OCRAD.set_image_from_file", "OCRAD.recognize" and "OCRAD.result_chars_block" handlers.
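
To give a feel for how small such a wrapper could be, here is a rough sketch using Python's ctypes as a stand-in for Extension Builder binding strings. Everything in it is an assumption to verify against your own copy of ocradlib.h:

- The C symbols are spelled with an underscore (`OCRAD_open`, `OCRAD_recognize` and so on) rather than the dotted names listed above.
- The argument and return types are as I read them in the header.
- The text itself comes back from `OCRAD_result_line`, with the `result_chars_*` calls only returning counts.
- A shared build of the library is installed. The Debian -dev package may ship only a static archive, in which case a shared library would need to be built first.

The helper names `load_ocrad` and `recognize_file` are made up for illustration.

```python
import ctypes
import ctypes.util


def load_ocrad():
    """Return a ctypes handle to a shared libocrad, or None if unavailable."""
    path = ctypes.util.find_library("ocrad")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
        # Signatures as read from ocradlib.h (assumption: check your copy).
        # The opaque OCRAD_Descriptor pointer is handled as a void pointer.
        lib.OCRAD_open.argtypes = []
        lib.OCRAD_open.restype = ctypes.c_void_p
        lib.OCRAD_close.argtypes = [ctypes.c_void_p]
        lib.OCRAD_close.restype = ctypes.c_int
        lib.OCRAD_set_image_from_file.argtypes = [
            ctypes.c_void_p, ctypes.c_char_p, ctypes.c_bool]
        lib.OCRAD_set_image_from_file.restype = ctypes.c_int
        lib.OCRAD_set_utf8_format.argtypes = [ctypes.c_void_p, ctypes.c_bool]
        lib.OCRAD_set_utf8_format.restype = ctypes.c_int
        lib.OCRAD_recognize.argtypes = [ctypes.c_void_p, ctypes.c_bool]
        lib.OCRAD_recognize.restype = ctypes.c_int
        lib.OCRAD_result_blocks.argtypes = [ctypes.c_void_p]
        lib.OCRAD_result_blocks.restype = ctypes.c_int
        lib.OCRAD_result_lines.argtypes = [ctypes.c_void_p, ctypes.c_int]
        lib.OCRAD_result_lines.restype = ctypes.c_int
        lib.OCRAD_result_line.argtypes = [
            ctypes.c_void_p, ctypes.c_int, ctypes.c_int]
        lib.OCRAD_result_line.restype = ctypes.c_char_p
    except (OSError, AttributeError):
        # Library could not be loaded, or an expected symbol is missing.
        return None
    return lib


def recognize_file(image_path, lib=None):
    """Run OCR on an image file and return the recognized text."""
    if lib is None:
        lib = load_ocrad()
    if lib is None:
        raise RuntimeError("no usable shared libocrad found")
    des = lib.OCRAD_open()
    if not des:
        raise RuntimeError("OCRAD_open failed")
    try:
        # Third argument: invert the image (False keeps it as-is).
        if lib.OCRAD_set_image_from_file(des, image_path.encode(), False) < 0:
            raise RuntimeError("could not load image: " + image_path)
        if lib.OCRAD_set_utf8_format(des, True) < 0:
            raise RuntimeError("could not select UTF-8 output")
        # Second argument: layout analysis (False treats the page as one block).
        if lib.OCRAD_recognize(des, False) < 0:
            raise RuntimeError("recognition failed")
        lines = []
        for block in range(lib.OCRAD_result_blocks(des)):
            for line in range(lib.OCRAD_result_lines(des, block)):
                raw = lib.OCRAD_result_line(des, block, line)
                if raw:
                    text = raw.decode("utf-8", errors="replace")
                    lines.append(text.rstrip("\r\n"))
        return "\n".join(lines)
    finally:
        lib.OCRAD_close(des)
```

With a shared library in place, `recognize_file("page.pbm")` would return the text as a single string (the file name is a placeholder). An LCB wrapper would presumably bind the same handful of functions and follow the same open, set image, recognize, read lines, close sequence.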

Re: OCR

Post by richmond62 »

OpenXTalkPaul wrote:
If you have newer macOS Ventura and above, instant Optical Character Recognition is built into the OS now.

Well aware of that, as I have used it several times on my macOS Sonoma box to 'rejuvenate' my wife's academic PDF documents from the early 90s (i.e. PDFs with no embedded text layer).

However, "as we all know" (why does that phrase make me want to throw up?), not everyone has the dubious benefits of either MacOS 13 or 14 . . .

Re: OCR

Post by OpenXTalkPaul »

richmond62 wrote: Tue Mar 26, 2024 7:44 am
However, "as we all know" (why does that phrase make me want to throw up?), not everyone has the dubious benefits of either MacOS 13 or 14 . . .
But if you DO have a recent macOS machine, or access to one (or some alternative way to run a recent macOS, perhaps via the magic of 'the cloud'), then you can save an image that has had its text recognized as a PDF, and THEN you can use the OXT Apple PDF wrapper extension to extract the plain text (possibly even Rich Text), which you can then use for anything, anywhere.

I'm certainly interested in what open-source alternatives are available, and I'd consider taking a crack at wrapping the OCR library mentioned earlier in this thread if I didn't already have a full plate.