
Conversion from MyBibleZone results in extraneous line breaks at beginning of verses #104

Open
1n5aN1aC opened this issue Jan 13, 2025 · 6 comments


@1n5aN1aC

This is only a minor issue, but I noticed the old ESword exporter that required extra tools had extra instructions for reducing unwanted whitespace.

Today, I converted multiple MyBibleZone formatted versions to E-Sword using the new ESwordV11 exporter.

First, I wanted to say thank you very much for making this tool, it worked amazingly!

However, some of the versions do still seem to have extraneous whitespace, especially newlines at the beginning of verses.

Is that something that the new exporter is supposed to solve? Or is this a problem with the new MyBibleZone format (#78)?

@schierlm
Owner

The unwanted whitespace with ToolTipTool was a bug in ToolTipTool itself, which even inserted newlines in the middle of tags, where it tripped over its own feet and caused errors. It has nothing to do with whitespace that is already present in the source module, which will be reproduced as-is. BibleMultiConverter will not magically create good-quality modules from bad ones.

I am not aware of any whitespace import issues in MyBibleZone either (#89 is about formatting tags).

That being said, if you believe it is a bug in the converter (especially when whitespace does not appear in MyBible.Zone but does after conversion), please tell me which module you converted (or attach it here if not available from the repository) and in which verse you see the whitespace.

You can also try first converting to StrippedDiffable with the StripLineBreaks option to make sure there are no line breaks left in the module before converting to ESwordV11. If line breaks still appear in the module after that (ones that actually exist in the resulting SQLite file and are not hallucinated by the viewer), there is a bug in the export module. Note that this will remove all line breaks, not only "unwanted" ones at the beginning of verses.

@1n5aN1aC
Author

1n5aN1aC commented Jan 13, 2025

I discovered the source of the "newlines", and I do not believe them to be directly related to this tool.

Looking at the SQLite file, all the affected verses have a lone <p> or <br /> at the beginning of the verse.

  • This is probably intentional in the original MyBibleZone format, as I think that format expects to display verses inline, so the <p> tag gets used to force a newline where needed.
  • In E-Sword, the most common viewing system is each verse on its own line, so the E-Sword file format should not have newlines between verses.

Personally, I would say the most 'proper' solution would be adding code to the ESwordV11 exporter to remove any newline / whitespace / <p> / <br /> tags at the beginning of verses, but I'll admit the exporter code is confusing me, so I'll just whip together some SQL to remove them for myself.
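For illustration, the stripping step described above could look roughly like the following Python sketch. It is not the exporter's actual code (BibleMultiConverter is written in Java), and the function name strip_leading_breaks and the exact set of tags treated as breaks are assumptions made for this example.

```python
import re

# Assumption: "breaks" means whitespace, <p>, <br>, <br/> or <br /> at the
# very start of the verse text. Other tags are left untouched.
LEADING_BREAKS = re.compile(r"^(?:\s|<p>|<br\s*/?>)+", re.IGNORECASE)


def strip_leading_breaks(verse_text: str) -> str:
    """Return verse_text without leading whitespace and break tags.

    Hypothetical helper for illustration only; breaks later in the verse
    are kept as they are.
    """
    return LEADING_BREAKS.sub("", verse_text)


if __name__ == "__main__":
    print(strip_leading_breaks("<p>In the beginning"))
    print(strip_leading_breaks("<br />\n <p>And God said<br />Let there be light"))
```

Only the leading run is removed, so a <br /> in the middle of a verse (for example in poetry) survives.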

I will close the issue and post the SQL I come up with in case anyone wants to do something similar in the future.

@schierlm
Owner

schierlm commented Jan 13, 2025

Most MyBibleZone modules I've encountered so far (some years ago...) have the <pb/> at the end of a verse, not at the beginning of the next one. Otherwise the verse number would end up at the end of the line and the rest of the verse would be on a new line. The same happens when converting to many other formats, where the verse number would end up stranded.

I guess if they were at the end of the line, ESword would also be able to cope with them just fine.

In case modules like this become more common, maybe I can add a StrippedDiffable option such as StripLineBreaksAtVerseStart or similar, or discard them altogether on MyBibleZone import, or (try to) move them to the end of the previous verse, where I believe they belong.

@1n5aN1aC
Author

1n5aN1aC commented Jan 13, 2025

Seems to me a lot of them have <p> tags at the beginning of verses, even quite common / popular versions.

On this page: https://www.ph4.org/b4_1.php?l=en&q=mybible

I just tested the following modules. When converted using only java -jar .\BibleMultiConverter-AllInOneEdition.jar MyBibleZone '.\inputfile.SQLite3' ESwordV11 "OutputFile", they all have <p> or <br /> tags at the beginning of verses:

  • OEB (2000 verses start with <br />, 2457 start with <p>)
  • RSV (7150 start with <br />, 4991 start with <p>)
  • NASB2020 (7065 start with <br />, 3663 start with <p>)
  • NASB1997 (with strong's)
  • ESV
  • CSB
  • NIV 2011

...But not all modules in this format have them. At least the older versions of NIV don't, and neither do Logos 1972 and ASV+.

@1n5aN1aC
Author

Here's the SQL I was using both to band-aid the problem, and to find the counts:

-- Remove a leading <p> (3 characters) from every verse that starts with it.
UPDATE Bible SET Scripture = substr(Scripture, 4) WHERE Scripture LIKE '<p>%';

-- Remove a leading <br /> (6 characters) from every verse that starts with it.
UPDATE Bible SET Scripture = substr(Scripture, 7) WHERE Scripture LIKE '<br />%';
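The counting side is not shown above. A minimal sketch of how such counts could be obtained with Python's built-in sqlite3 module follows. The table name Bible and the column Scripture are taken from the statements above; the function name count_leading_tags and the file name in the usage line are made up for this example.

```python
import sqlite3
from contextlib import closing


def count_leading_tags(db_path: str) -> dict:
    """Count verses whose Scripture starts with <p> or with <br />.

    Assumes the converted module has a table Bible with a text column
    Scripture, as in the UPDATE statements above.
    """
    counts = {}
    with closing(sqlite3.connect(db_path)) as conn:
        for prefix in ("<p>", "<br />"):
            # Compare the first len(prefix) characters exactly.
            row = conn.execute(
                "SELECT COUNT(*) FROM Bible WHERE substr(Scripture, 1, ?) = ?",
                (len(prefix), prefix),
            ).fetchone()
            counts[prefix] = row[0]
    return counts


if __name__ == "__main__":
    # "OutputFile.bblx" is a placeholder path, not a name from this thread.
    print(count_leading_tags("OutputFile.bblx"))
```

Running this before and after the UPDATE statements gives a quick check that the leading tags are gone.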

@schierlm
Owner

Indeed, I just had a look at the latest RSV module (both in the app and when converting) and the line breaks are all at the beginning of verses. Yet the app no longer cuts off the verse numbers. So I'll have to update my converter to handle this better by moving each line break to the end of the previous verse (provided there is no headline in between and the verse is not the first one).
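As a rough illustration of that idea (not the converter's actual implementation, which is written in Java), the move could be sketched in Python as below. The data model, a list of (kind, text) tuples for one chapter with kind being "verse" or "headline", and the name move_leading_breaks are assumptions for this example. The sketch moves only a single leading tag and leaves the break in place for a first verse or a verse that follows a headline, since the thread does not say what should happen in those cases.

```python
import re

# One leading <p>, <br>, <br/> or <br /> tag, with optional whitespace around it.
LEADING_BREAK = re.compile(r"^\s*(<p>|<br\s*/?>)\s*", re.IGNORECASE)


def move_leading_breaks(items):
    """Move a leading break tag from a verse to the end of the previous verse.

    items is a list of (kind, text) tuples in reading order for one chapter,
    where kind is "verse" or "headline" (a hypothetical model for this sketch).
    The break is moved only if the directly preceding item is a verse.
    """
    result = []
    for kind, text in items:
        if kind == "verse":
            match = LEADING_BREAK.match(text)
            if match and result and result[-1][0] == "verse":
                prev_kind, prev_text = result[-1]
                # Append the tag to the previous verse and drop it here.
                result[-1] = (prev_kind, prev_text + match.group(1))
                text = text[match.end():]
        result.append((kind, text))
    return result


if __name__ == "__main__":
    chapter = [
        ("verse", "First verse."),
        ("verse", "<p>Second verse."),
        ("headline", "A headline"),
        ("verse", "<br />Third verse."),
    ]
    for item in move_leading_breaks(chapter):
        print(item)
```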

I will update the issue metadata accordingly :)

@schierlm changed the title from "Conversion to ESwordV11 still has many extraneous line breaks" to "Conversion from MyBibleZone results in extraneous line breaks at beginning of verses" on Jan 13, 2025