Export: The Good, The Bad, and The Weird discussion

How to fill the holes in an export file?

message 1: by Mesembryanthemum (last edited Jul 17, 2023 03:59PM) (new)

Mesembryanthemum | 34 comments Does anyone have any tips or suggestions about how to gather information that’s not included in an export file? (Here's a list of missing items.)

I’d like to have complete information for a number of reasons. Of course, I want to preserve the data that I see when I view a book page, or an edition list, or my shelves. I’d also like to sort the export file’s rows by the “Sort-by title”, but the export file doesn’t include that field. And I REALLY want to have all edition-specific book descriptions, in case I need to fix bad changes to the books on my shelves. (Sometimes, book descriptions get erased when a bot or inexperienced human combines editions.) And much more.

I know I could look at each book page and enter the missing information by hand, but I export often (several times a month), and I’ve shelved thousands of books. That’s a lot of manual changes.

Any ideas? Any pointers to possible solutions or areas for constructive research?


P.S. This post is distilled from an off-topic thread that I posted in the librarian’s group (at this link). Export functionality is off-topic for that group, so please keep discussions in this group.


message 2: by Elizabeth (Alaska) (last edited Jul 18, 2023 12:56PM) (new)

Elizabeth (Alaska) Thank you for opening the group, Mes. I know how frustrating this whole thing is for you.

With your mention in the other group about ASINs, I have taken the time over the past few days to add the ASIN to my Private Notes field, which *does* export.

I made mistakes along the way, not realizing that I could add the ASIN to the visible fields on My Books. :-(

When I finally (!) realized I could do this, the process was a *lot* easier. Still, I saw that some Kindle editions had ISBN-10s in the ASIN field. I don't know whether I shelved the wrong edition or what, but I was able to go to Amazon and get the ASIN currently assigned to my purchase and switch editions.

The other thing was that some of my Kindle editions had no information in the ASIN field. These had become Alternate Cover Editions. For most of these, there wasn't even the ACE verbiage in the description. I was able to get the ASIN from the librarian notes and added that to my Private Notes field.

I did this only for those books I have On Hand and not yet read. I didn't make much attempt to do this for those I have read. For those of you with more than the 350+ that I worked with, you have my sympathy. Even for me this was a several-day process, but I think it was worth doing because now I have the information on the Export.
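On the ISBN-10s that turned up in the ASIN field: a rough way to flag them in bulk is the standard ISBN-10 checksum. The sketch below is only that, a sketch. The function names are made up for illustration, and the idea that a Kindle ASIN is any 10-character value that fails the ISBN-10 check is an assumption, not something Goodreads or Amazon documents in this thread.

```python
def is_isbn10(value: str) -> bool:
    """Return True if value passes the ISBN-10 checksum (weights 10 down to 1, mod 11)."""
    value = value.strip().upper()
    if len(value) != 10:
        return False
    total = 0
    for position, char in enumerate(value):
        if char == "X" and position == 9:
            digit = 10  # "X" means 10 and is only allowed as the final check digit
        elif char in "0123456789":
            digit = int(char)
        else:
            return False
        total += (10 - position) * digit
    return total % 11 == 0


def looks_like_kindle_asin(value: str) -> bool:
    """Assumption: a 10-character alphanumeric value that is NOT a valid ISBN-10."""
    value = value.strip().upper()
    return len(value) == 10 and value.isalnum() and not is_isbn10(value)


if __name__ == "__main__":
    # Replace these with the values shown in the ASIN column of your shelf.
    for candidate in ["0571049966", "B00EXAMPLE"]:  # second value is a made-up placeholder
        kind = "ISBN-10" if is_isbn10(candidate) else "possible ASIN"
        print(candidate, "->", kind)
```

Anything flagged as an ISBN-10 on a Kindle edition would still need the manual check on Amazon described above; the code only narrows down which books to look at.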


message 3: by Dobby (new)

Dobby (dobby0390) | 4 comments Thanks for these work-arounds, Elizabeth! What a lot of work you did. The copy/paste (or copy/drag) of ASINs is a wonderful find.


message 4: by Elizabeth (Alaska) (new)

Elizabeth (Alaska) Also, I just checked. The Private Notes field will accept 512 characters, so would also accept some of the other missing information, though you might have to use abbreviations.


message 5: by Mesembryanthemum (new)

Mesembryanthemum | 34 comments Using the "Private Notes" field is a great tip. Even if it's fussy to set up, once the info is there, it will stay there. Double thanks for researching the maximum length of this field.


message 6: by Elizabeth (Alaska) (last edited Jul 18, 2023 02:20PM) (new)

Elizabeth (Alaska) I might say that going forward, when I add a Kindle book to my shelves, I will automatically add the ASIN to the Notes field.

I may consider adding the ISBN for physical copies as I acquire them.

The bot is deleting records. I'd like to retain the basic info for the books I have/read.

Mes, you want a lot more from your Export than do I. I don't see descriptions and other info you'd like ever exporting, even if GR should get its act together. I don't know whether you want to set up auxiliary files for yourself separate from the GR Export. I do know that if you chose to do so, the info would always be available to you. Lots of work, but ...


message 7: by Mesembryanthemum (last edited Jul 18, 2023 03:33PM) (new)

Mesembryanthemum | 34 comments Elizabeth (Alaska) wrote: "Mes, you want a lot more from your Export than do ... ... ... I do know that if you chose to do so, the info would always be available to you. Lots of work, but ..."

Yes, I admit that I'm at the "fringe end" of wanting to protect everything from possible MadBot damage, including the books I've catalogued or corrected but not shelved.

But I'm also lazy. (LOL) Doing a lot of extra work, by hand, when I add or fix a book is just not realistic. Possible, but not likely.


message 8: by Elizabeth (Alaska) (new)

Elizabeth (Alaska) Mesembryanthemum wrote: "But I'm also lazy. (LOL) Doing a lot of extra work, by hand, when I add or fix a book is just not realistic. Possible, but not likely."

Me too, the lazy part. But you inspired me. And nothing *has* to be done today or even this week (or month).


message 9: by Shaz (last edited Jul 18, 2023 06:39PM) (new)

Shaz | 9 comments I agree with Elizabeth, Mes. I'm happy with the information I get, although yes, I would like the ASIN to be included.

Like Elizabeth I'm now just adding the ASIN to every new book I'm adding, although I'll admit, I'm updating both GR and my spreadsheet. I don't like where GR is heading with the database in the shape it is now. And I'd hate to lose all my data!

ETA I'd also like all the additional read dates to show up. I'm adding them individually at the moment, but it's a pain, lol. I did try the trick that was suggested by another poster in the other group, but I haven't done an export since to see if it worked.


message 10: by Elizabeth (Alaska) (new)

Elizabeth (Alaska) I don’t know how to write Macros. But I think someone who does know how to do it could easily add the sort by title info. If you know people with some expertise, perhaps one would help with this.
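For anyone who wants to try this outside a spreadsheet macro, here is a minimal sketch in Python. It assumes the export CSV has a column named "Title" and that a "sort-by title" simply moves a leading English article to the end. Both are assumptions for illustration; the rule Goodreads actually uses for its own sort-by title may differ, and the function and column names here are hypothetical.

```python
import csv
import io

LEADING_ARTICLES = ("the ", "a ", "an ")  # assumption: English articles only


def sort_title(title: str) -> str:
    """Move a leading article to the end, e.g. 'The Book' becomes 'Book, The'."""
    lowered = title.lower()
    for article in LEADING_ARTICLES:
        if lowered.startswith(article):
            cut = len(article)
            return f"{title[cut:].strip()}, {title[:cut].strip()}"
    return title


def add_sort_title(csv_text: str, title_column: str = "Title") -> str:
    """Return the CSV text with a 'Sort Title' column appended to every row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or []) + ["Sort Title"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["Sort Title"] = sort_title(row[title_column])
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    sample = "Title,Author\nThe Palm-Wine Drinkard,Amos Tutuola\n"
    print(add_sort_title(sample))
```

The new column can then be used as the sort key in any spreadsheet program after opening the modified file.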


message 11: by Elizabeth (Alaska) (last edited Jul 19, 2023 03:17PM) (new)

Elizabeth (Alaska) Update on the Private Notes field: I *think* that line returns eat up characters. In the export file, one of my notes has a double return between "Works" and "UK". I have added spaces here so that the HTML displays:

Multiple Works< br/ >< br/ >UK, 1815-1882


message 12: by Mesembryanthemum (new)

Mesembryanthemum | 34 comments Elizabeth (Alaska) wrote: "Update on the Private Notes field: I *think* that line returns eat up characters."

I'm pretty sure that you're right. For most computer programs, a line return is considered to be one character. Two line returns would be two characters. Whenever you see a < br/ > or <br> in the export file, that's one line-return character.

Space is a valuable commodity, even for computers.


message 13: by Mesembryanthemum (new)

Mesembryanthemum | 34 comments P.S. And thanks so much for continuing to update us on your research. It's extremely helpful!


message 14: by Elizabeth (Alaska) (new)

Elizabeth (Alaska) Mesembryanthemum wrote: "For most computer programs, a line return is considered to be one character."

This is good to know. It looked to me like several characters. I was thinking about replacing with a ;

But none of my notes nears the limit yet, so I can continue to be lazy. ;-)


message 15: by Mesembryanthemum (last edited Jul 20, 2023 12:54PM) (new)

Mesembryanthemum | 34 comments Or I might be wrong...

I had another think, and now I wonder if perhaps the < br/ > dealies might be considered as several characters (as many as 6). It might be that the thing that counts characters is just looking at the "text" for the character and doesn't care that they translate to a single character "underneath".

Perhaps the ; would be better.

Edit to correct some words and numbers.


message 16: by Elizabeth (Alaska) (new)

Elizabeth (Alaska) OK! I will edit if any of my entries begin to look long.


message 17: by Elizabeth (Alaska) (last edited Jul 20, 2023 01:43PM) (new)

Elizabeth (Alaska) No, you were right the first time. A return is 1 character. A space is one character.

I opened one of the books (edit review page is where it shows characters for the Notes field) where I have that double return entry, opened the Private Notes field and compared characters remaining with characters used.

I'm not sure why the HTML code shows up in the Export, but possibly it's because the data first comes over in CSV format.
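To make the two ways of counting concrete, here is a small sketch. It counts the note from message 11 once with each line return as a single character (what the edit page appears to do) and once with each return written out as an HTML tag (what the export shows). The 512 limit is the figure reported in this thread; the tag spelling "<br/>" and the function names are assumptions for illustration.

```python
PRIVATE_NOTES_LIMIT = 512  # limit reported earlier in this thread


def length_as_typed(note: str) -> int:
    """Count every character once, so a line return counts as one character."""
    return len(note)


def length_as_exported(note: str, line_break: str = "<br/>") -> int:
    """Count the note with each line return replaced by an HTML tag (assumed spelling)."""
    return len(note.replace("\n", line_break))


if __name__ == "__main__":
    note = "Multiple Works\n\nUK, 1815-1882"
    typed = length_as_typed(note)
    exported = length_as_exported(note)
    print("as typed:", typed, "remaining:", PRIVATE_NOTES_LIMIT - typed)
    print("as exported:", exported)
```

If the typed count is the one that matters, as the test on the edit review page suggests, the extra length in the export file is harmless.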


message 18: by Mesembryanthemum (new)

Mesembryanthemum | 34 comments I finally checked my export files. The HTML code is preserved in the export files, with all the < and > and / and . characters. It's probably safest to assume that those "extra" characters will count towards the 512-character limit.

For example, some of my reviews have "fully clickable links" that use this format:
<a href="https : // www . something . com / otherthing / stuff">linktext< /a> 
(without any spaces around the colon and slashes and periods, of course, which are there so you can see the HTML code)

It would save several characters to omit the HTML code and use a shortened version of the link (again, without the spaces):
something . com / otherthing / stuff 

This example would save 27 characters, plus whatever I used for "linktext".
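The saving can be checked with a few lines of Python rather than by counting on fingers. This is just the example link from above with the spaces removed; the helper name is made up for illustration.

```python
def html_link(url: str, link_text: str) -> str:
    """Build the 'fully clickable link' form used in the example above."""
    return f'<a href="{url}">{link_text}</a>'


full_url = "https://www.something.com/otherthing/stuff"
short_url = "something.com/otherthing/stuff"
link_text = "linktext"

# Total characters saved by replacing the whole HTML link with the short URL.
saved = len(html_link(full_url, link_text)) - len(short_url)

# The part of the saving that does not depend on the link text.
markup_only = saved - len(link_text)

if __name__ == "__main__":
    print("saved in total:", saved)
    print("saved apart from the link text:", markup_only)
```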


message 19: by Elizabeth (Alaska) (last edited Aug 14, 2023 05:38AM) (new)

Elizabeth (Alaska) Reviews have a much larger character limit - 20,000. That allows for very long reviews. I have never hit the character limit and I have seen many reviews that are much longer than those I write. And the Export file receives whatever Goodreads allows in those fields. The Export file doesn't have a limit.

I only referenced the Private Notes field with a 512 character limit.


message 20: by Mesembryanthemum (new)

Mesembryanthemum | 34 comments Oops! Yes, of course, you clearly said "Private Notes" above. My tiny brain skipped that detail and focused only on the reviews section. My bad.

Thanks for the character limit for GR reviews. This is also very good to know.


message 21: by Mesembryanthemum (last edited Aug 14, 2023 11:58PM) (new)

Mesembryanthemum | 34 comments Here's a tidbit of new information so those clicking on this thread will at least see something more than my "oops" comment above. This one's for newbies (and non-newbies, like me, who aren't good at using the GR website).

Each shelf has a "Settings" link at the top, next to "Batch edit". You can use it to display the ASINs for the books on your shelf, plus much more. AHA! (What can I say — I'm not good with a complicated UI that has tiny, text-only links.)

I investigated, of course, because I want to automate the process of saving the ASINs for my Kindle books. Alas, I'm not there yet, but this is the first step.

WHAT I DID:

1. Set my shelf display to "table view" and enabled the dreaded "infinite scroll" option.

2. Changed my settings to check every available column.

3. Scrolled down so that all books on the shelf are displayed on this page.

4. Copied the contents of the shelf -- for example, by using my mouse to click-and-drag to select the header row (optional) and all of the book information -- then pasted it into a text file.

THE RESULTS:

If you copied the heading row, it looks like this (note that ^I signifies a tab character):
cover ^Ititle Up arrow ^Iauthor ^Iisbn ^Iisbn13 ^Iasin ^Inum pages ^Iavg rating ^Inum ratings ^Idate pub ^Idate pub (ed.) ^Irating ^Ishelves ^Ireview ^Inotes ^I^Icomments ^Ivotes ^Iread count ^Idate started ^Idate read ^Idate added ^I^Iowned ^I^I^Iformat ^I

The header is followed by 57 lines for each book "row". Each item for the book is on its own line, followed by a blank line (though it's actually the TAB character, so it's not truly blank).

The title is on the 3rd line of the entry. The ASIN is on the 11th line (it's the third long number in the entry).
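Those two line numbers follow from the column order: each column takes one line plus one separator line, so the column in position k (counting from 1) lands on line 2k - 1 of an entry. A small sketch, using only the first six column names from the header row and treating "^I" as literal text; the function names are made up for illustration:

```python
# First six columns of the header row shown above, cut short after "asin".
header_as_shown = "cover ^Ititle Up arrow ^Iauthor ^Iisbn ^Iisbn13 ^Iasin"


def column_names(header: str, tab_marker: str = "^I") -> list[str]:
    """Split the displayed header on the tab marker and trim stray spaces."""
    return [name.strip() for name in header.split(tab_marker)]


def line_number_in_entry(columns: list[str], wanted: str) -> int:
    """Return the 1-based line on which the wanted column appears in a pasted entry."""
    for index, name in enumerate(columns):
        if name.startswith(wanted):
            return 2 * index + 1  # index is 0-based, so this equals 2k - 1
    raise ValueError(f"no column starting with {wanted!r}")


if __name__ == "__main__":
    columns = column_names(header_as_shown)
    for wanted in ("title", "asin"):
        print(wanted, "is on line", line_number_in_entry(columns, wanted))
```

This only holds while every earlier column produces exactly one line; a column that pastes as several lines or as nothing at all would shift the positions.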

Here's an example book entry:
The Palm-Wine Drinkard

The Palm-Wine Drinkard

Tutuola, Amos

0571049966

9780571049967

0571049966

125 pp

3.77

2,939

1952

1977

1 of 5 stars
2 of 5 stars
3 of 5 stars
[ 4 of 5 stars ]
5 of 5 stars

read, я-2020-read
[edit]

Like reading a dream. I loved it. [edit]

None [edit]

0

0

1

Dec 25, 2020 [edit]

Dec 25, 2020 [edit]

Dec 26, 2020


Paperback [edit]

edit
view »
Remove from my books

Someday, I'll figure out how to extract just the title and ASIN from each entry, so that I have a simple list of the ASINs to add to each review's Private Notes field. But my Linux shell-scripting skills are very rusty, so that will have to wait.
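For what it's worth, here is a minimal sketch of that extraction in Python instead of shell. It leans on several assumptions taken from the example entry above: the header row was not copied, every entry ends with the line "Remove from my books", the title is on line 3, and the ASIN is on line 11. If any field before the ASIN is blank or spans more lines, the positions shift and the output will be wrong, so treat the result as a list to spot-check rather than trust. The names used are hypothetical.

```python
END_OF_ENTRY = "Remove from my books"  # last line of each pasted entry in the example
TITLE_LINE = 3   # 1-based positions reported in this post
ASIN_LINE = 11


def extract_titles_and_asins(pasted: str) -> list[tuple[str, str]]:
    """Return (title, asin) pairs, one per entry found in the pasted shelf text."""
    results = []
    entry: list[str] = []
    for raw_line in pasted.splitlines():
        line = raw_line.strip()  # tab-only separator lines become empty strings
        if not entry and not line:
            continue  # skip blank lines between entries
        entry.append(line)
        if line == END_OF_ENTRY:
            if len(entry) >= ASIN_LINE:
                results.append((entry[TITLE_LINE - 1], entry[ASIN_LINE - 1]))
            entry = []
    return results


if __name__ == "__main__":
    # "shelf.txt" is a placeholder name for the text file from step 4.
    with open("shelf.txt", encoding="utf-8") as handle:
        for title, asin in extract_titles_and_asins(handle.read()):
            print(f"{asin}\t{title}")
```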


message 22: by Elizabeth (Alaska) (last edited Aug 15, 2023 09:00AM) (new)

Elizabeth (Alaska) The lack of ASINs in the Export file is why we were discussing the Private Notes field.

See my Post #2 on the subject. The Settings is where you get that information. I didn't realize you didn't know that.


message 23: by Mesembryanthemum (new)

Mesembryanthemum | 34 comments Oh, there's so much I don't know. I don't even know whether I once knew this and had just forgotten. LOL / sigh.

Today, I also know that I can't count. The number of lines that I reported in message 21 is wrong: That entry has 53 lines (not 57).

More importantly, I noticed today that the number of lines for a book entry is different if there are any blank entries, or if there are multiple read dates.

So I've given up the idea of writing a Linux-based parser for this information. It could be done, but it would take more time (for me, at least) than just looking at the shelf on GR. Instead, I'll keep chipping away at adding ASINs (and multiple read dates) to my Private Notes.

