IMHO the core of this question is in these few words:
Unfortunately, when you’re dealing with extremely large lists with items of varying sizes, this isn’t performant. While a cache may be leveraged, even that doesn’t work out so well when you need to know the total height (height of all items combined) at the very beginning.
This contrasts sharply with JavaScript’s nature and philosophy: combining “extremely large lists” with “at the very beginning” is something that doesn’t work well in JavaScript.
You can probably achieve better results with less effort if you focus on why you need this “at the very beginning” rather than seeking the actual answer to this question. However performant a solution you find, as the “extremely large lists” continue to grow, it will unavoidably block the UI.
This is only my two cents.
Solution 2 :
I found the Measure text algorithm, which approximates the width of strings without touching the DOM.
I modified it a little to calculate the number of lines (where you are stuck).
You can calculate the number of lines like below:
/**
 * @param {string} text - The text to be rendered.
 * @param {number} containerWidth - Width of the container in which the text will be rendered.
 * @param {number} fontSize - Font size of the rendered text.
 */
function calculateLines(text, containerWidth, fontSize = 14) {
  let lines = 1; // Start the line count at 1
  // widths & avg values are based on the `Helvetica` font.
const widths = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.278125,0.278125,0.35625,0.55625,0.55625,0.890625,0.6671875,0.1921875,0.334375,0.334375,0.390625,0.584375,0.278125,0.334375,0.278125,0.303125,0.55625,0.55625,0.55625,0.55625,0.55625,0.55625,0.55625,0.55625,0.55625,0.55625,0.278125,0.278125,0.5859375,0.584375,0.5859375,0.55625,1.015625,0.6671875,0.6671875,0.7234375,0.7234375,0.6671875,0.6109375,0.778125,0.7234375,0.278125,0.5,0.6671875,0.55625,0.834375,0.7234375,0.778125,0.6671875,0.778125,0.7234375,0.6671875,0.6109375,0.7234375,0.6671875,0.9453125,0.6671875,0.6671875,0.6109375,0.278125,0.35625,0.278125,0.478125,0.55625,0.334375,0.55625,0.55625,0.5,0.55625,0.55625,0.278125,0.55625,0.55625,0.2234375,0.2421875,0.5,0.2234375,0.834375,0.55625,0.55625,0.55625,0.55625,0.334375,0.5,0.278125,0.55625,0.5,0.7234375,0.5,0.5,0.5,0.35625,0.2609375,0.3546875,0.590625]
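  const avg = 0.5293256578947368;
  text.split('')
    .map(c => c.charCodeAt(0) < widths.length ? widths[c.charCodeAt(0)] : avg)
    // lineWidth is the running width of the current line; charWidth is the next character
    .reduce((lineWidth, charWidth) => {
      if ((lineWidth + charWidth) * fontSize > containerWidth) {
        lines++;
        return charWidth; // this character starts the new line
      }
      return lineWidth + charWidth;
    }, 0);
  return lines;
}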
Note
I used Helvetica as the font-family; you can get the values of widths & avg from Measure text for whichever font-family you have.
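One limitation of the function above is that it wraps character by character, whereas browsers normally break lines at spaces. Below is a minimal sketch of a word-level variant. The names estimateWidth and calculateLinesByWord are hypothetical, and the demo call passes an empty width table so that every character falls back to avg; those numbers are made up for illustration and are not real Helvetica metrics.

```javascript
// Hypothetical helper: estimated pixel width of a string, given a table of
// per-character widths (relative to the font size, indexed by char code) and
// an average width used for characters outside the table.
function estimateWidth(str, fontSize, widths, avg) {
  let total = 0;
  for (const ch of str) {
    const code = ch.charCodeAt(0);
    total += code < widths.length ? widths[code] : avg;
  }
  return total * fontSize;
}

// Hypothetical word-level line counter: greedy wrapping at whitespace.
// A word wider than the container stays on its own line and is not broken.
function calculateLinesByWord(text, containerWidth, fontSize, widths, avg) {
  const words = text.trim().split(/\s+/).filter(Boolean);
  const spaceWidth = estimateWidth(' ', fontSize, widths, avg);
  let lines = 1;
  let lineWidth = 0;
  for (const word of words) {
    const wordWidth = estimateWidth(word, fontSize, widths, avg);
    if (lineWidth > 0 && lineWidth + spaceWidth + wordWidth > containerWidth) {
      lines++;
      lineWidth = wordWidth; // the word starts a new line
    } else {
      lineWidth += (lineWidth > 0 ? spaceWidth : 0) + wordWidth;
    }
  }
  return lines;
}

// Illustrative data only: empty table, so every character is avg * fontSize = 5px.
const demoLines = calculateLinesByWord('aaaa bbbb cccc', 50, 10, [], 0.5);
```

With these made-up numbers each word is 20px and a space is 5px, so two words fit in 50px and the third wraps, giving 2 lines.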
Problem :
I’m leveraging a virtualized list (react-virtualized) where the heights of my list items are required and could vary greatly. Due to large variations, any height estimation I give the library yields a poor experience.
The usual method for height calculation goes something like this:
Unfortunately, when you’re dealing with extremely large lists with items of varying sizes, this isn’t performant. While a cache may be leveraged, even that doesn’t work out so well when you need to know the total height (height of all items combined) at the very beginning.
A second solution often leveraged is through HTML canvas’ measureText. The performance is akin to the above DOM manipulation.
In my case, I know the following:
Container Width
Font
Font size
All padding
All margins
Any and all other styling like line-height
What I’m looking for is a mathematical solution that can compute the height (or an extremely close estimate) such that I don’t have to rely on any DOM manipulation and I can get the height whenever I please.
I imagine it goes something like this:
const measureText = (text, options) => {
  const { width, font, fontSize, padding, margins, borders, lineHeight } = options;
  // Assume this magical function exists.
  // It all depends on width, styling and font information.
  const numberOfLines = calculateLines(text, options);
  const contentHeight = numberOfLines * lineHeight;
  // This is all pseudo-code, but somehow get the pixel thickness of the top and bottom borders.
  const borderHeight = borders.width * 2;
  const marginsHeight = margins.top + margins.bottom;
  const paddingHeight = padding.top + padding.bottom;
  return marginsHeight + paddingHeight + borderHeight + contentHeight;
};
In the above, we’re missing the calculateLines function, which seems like the bulk of the work. How would one move forward on that front? Would I need to do some pre-processing to figure out character widths? Since I know the font I’m using, this shouldn’t be too big an issue, right?
Do browser concerns exist? How might the calculation vary on each browser?
Are there any other parameters to consider? For example, if the user has some system setting that enlarges text for them (accessibility), does the browser tell me this through any usable data?
I understand rendering to the DOM is the simplest approach, but I’m willing to put the effort into a formulaic solution even if that means every time I change margins, etc. I need to ensure the inputs to the function are updated.
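To make the arithmetic in the measureText pseudo-code concrete, here is a minimal worked sketch with made-up numbers. The helper name estimateHeight and the shape of its options object are assumptions for illustration, and the line count is simply passed in rather than computed.

```javascript
// Hypothetical helper: box height in pixels from a known line count and
// known box-model values. All names here are illustrative assumptions.
function estimateHeight(numberOfLines, { lineHeight, padding, margins, borderWidth }) {
  const contentHeight = numberOfLines * lineHeight;
  const paddingHeight = padding.top + padding.bottom;
  const marginsHeight = margins.top + margins.bottom;
  const borderHeight = borderWidth * 2; // top border + bottom border
  return contentHeight + paddingHeight + marginsHeight + borderHeight;
}

const height = estimateHeight(3, {
  lineHeight: 20,
  padding: { top: 8, bottom: 8 },
  margins: { top: 4, bottom: 4 },
  borderWidth: 1,
});
```

With 3 lines at a 20px line-height, the content contributes 60px, padding 16px, margins 8px and borders 2px, for 86px in total. Note that adjacent vertical margins can collapse in normal block flow, so whether the margins should be summed like this depends on how the list items are laid out.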
Update 2: With monospaced typefaces, the width calculation becomes even simpler, as you only need to measure the width of one character. Surprisingly, there are some very nice and popular fonts like Menlo and Monaco on the list.
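As a rough sketch of the monospaced case: if every character (including the space) is assumed to be the same width, line counting reduces to counting characters. The function name countLinesMonospace and the 8px character width below are hypothetical; in practice the width of one character would be measured once for the actual font and size.

```javascript
// Hypothetical helper: greedy word wrap for a monospaced font, where every
// character is assumed to be `charWidth` pixels wide. Word length is counted
// in UTF-16 units, which is fine for plain ASCII text.
function countLinesMonospace(text, containerWidth, charWidth) {
  const maxChars = Math.floor(containerWidth / charWidth);
  const words = text.trim().split(/\s+/).filter(Boolean);
  let lines = 1;
  let used = 0; // characters already placed on the current line
  for (const word of words) {
    const needed = used === 0 ? word.length : used + 1 + word.length;
    if (used > 0 && needed > maxChars) {
      lines++;
      used = word.length; // the word starts a new line
    } else {
      used = needed;
    }
  }
  return lines;
}

// 80px container / 8px per character = 10 characters per line.
const monoLines = countLinesMonospace('the quick brown fox jumps', 80, 8);
```

Here "the quick" fills the first line (9 characters), "brown fox" the second, and "jumps" the third, so the result is 3.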
Big Update 3: It was quite an all-nighter, but through inspiration via the SVG method in update 1, I came up with something that has been working fantastically to calculate the number of lines. Unfortunately, I’ve seen that 1% of the time it is off by 1 line. The following is roughly the code:
const wordWidths = {} as { [word: string]: number };
const xmlsn = "http://www.w3.org/2000/svg";
const svg = document.createElementNS(xmlsn, "svg");
const text = document.createElementNS(xmlsn, "text");
const spaceText = document.createElementNS(xmlsn, "text");
svg.appendChild(text);
svg.appendChild(spaceText);
document.body.appendChild(svg);
// Convert style objects like { backgroundColor: "red" } to "background-color: red;" strings for HTML
const styleString = (object: any) => {
  return Object.keys(object).reduce((prev, curr) => {
    return `${(prev += curr
      .split(/(?=[A-Z])/)
      .join("-")
      .toLowerCase())}:${object[curr]};`;
  }, "");
};
const getWordWidth = (character: string, style: any) => {
  // Note: the cache is keyed by the word only, so it assumes a single style is in use.
  const cachedWidth = wordWidths[character];
  // Compare against undefined so that a cached width of 0 still counts as a hit
  if (cachedWidth !== undefined) return cachedWidth;
  let width: number;
  // edge case: a naked space (charCode 32) takes up no space, so we need
  // to handle it differently. Wrap it between two letters, then subtract those
  // two letters from the total width.
  if (character === " ") {
    const textNode = document.createTextNode("t t");
    spaceText.appendChild(textNode);
    spaceText.setAttribute("style", styleString(style));
    width = spaceText.getBoundingClientRect().width;
    width -= 2 * getWordWidth("t", style);
    wordWidths[" "] = width;
    spaceText.removeChild(textNode);
  } else {
    const textNode = document.createTextNode(character);
    text.appendChild(textNode);
    text.setAttribute("style", styleString(style));
    width = text.getBoundingClientRect().width;
    wordWidths[character] = width;
    text.removeChild(textNode);
  }
  return width;
};
const getNumberOfLines = (text: string, maxWidth: number, style: any) => {
  let numberOfLines = 1;
  // In my use-case, I trim all white-space and don't allow multiple spaces in a row
  // It also simplifies this logic. Though, for now this logic does not handle
  // new-lines
  const words = text.replace(/\s+/g, " ").trim().split(" ");
  const spaceWidth = getWordWidth(" ", style);
  let lineWidth = 0;
  const wordsLength = words.length;
  for (let i = 0; i < wordsLength; i++) {
    const wordWidth = getWordWidth(words[i], style);
    if (lineWidth + wordWidth > maxWidth) {
      /**
       * If the line has no other words (lineWidth === 0),
       * then this word will overflow the line indefinitely.
       * Browsers will not push the text to the next line. This is intuitive.
       *
       * Hence, we only move to the next line if this line already has
       * a word (lineWidth !== 0)
       */
      if (lineWidth !== 0) {
        numberOfLines += 1;
      }
      lineWidth = wordWidth + spaceWidth;
      continue;
    }
    lineWidth += wordWidth + spaceWidth;
  }
  return numberOfLines;
};
Originally, I did this character-by-character, but because kerning affects groups of letters, going word by word is more accurate. It’s also important to note that although style is leveraged, the padding must be accounted for in the maxWidth parameter: CSS padding won’t have any effect on the SVG text element. It handles the width-adjusting style letter-spacing decently (it’s not perfect and I’m not sure why).
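The wrapping logic itself does not depend on how a word is measured, so it can be exercised without a DOM by passing the measuring function in. The sketch below is an illustration under that assumption: createCachedMeasure and countLines are hypothetical names, and the stand-in measurer pretends every character is 7px wide rather than doing any real SVG measurement.

```javascript
// Hypothetical factory: wraps any width-measuring function with a per-word cache.
function createCachedMeasure(measure) {
  const cache = new Map();
  return (word) => {
    if (!cache.has(word)) cache.set(word, measure(word));
    return cache.get(word);
  };
}

// Same greedy word-by-word wrapping idea as getNumberOfLines, but the
// measuring function is injected instead of touching the DOM directly.
function countLines(text, maxWidth, measure) {
  const words = text.replace(/\s+/g, ' ').trim().split(' ');
  const spaceWidth = measure(' ');
  let numberOfLines = 1;
  let lineWidth = 0;
  for (const word of words) {
    const wordWidth = measure(word);
    // Only wrap if the line already holds a word; a lone oversized word overflows.
    if (lineWidth !== 0 && lineWidth + wordWidth > maxWidth) {
      numberOfLines += 1;
      lineWidth = 0;
    }
    lineWidth += wordWidth + spaceWidth;
  }
  return numberOfLines;
}

// Stand-in measurer for illustration only: 7px per character.
let calls = 0;
const fakeMeasure = createCachedMeasure((word) => {
  calls += 1;
  return word.length * 7;
});
const lineCount = countLines('to be or not to be', 50, fakeMeasure);
```

In a browser, the stand-in could be swapped for a real measurer (for example one based on the SVG approach above) while the wrapping code stays the same. Repeated words such as "to" and "be" are measured only once thanks to the cache.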
As for internationalization, it seemed to work just as well as it did with English, except when I got into Chinese. I don’t know Chinese, but it seems to follow different rules for overflowing onto new lines, and this doesn’t account for those rules.
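One likely reason is that Chinese is typically written without spaces, and browsers can generally break a line between most adjacent ideographs, so splitting on spaces treats a whole sentence as one unbreakable word. The sketch below is only a rough starting point under that assumption: breakableUnits is a hypothetical name, and it ignores the finer rules browsers apply (for example, punctuation that must not start a line).

```javascript
// Hypothetical tokenizer: splits text into units that may start a new line.
// Assumption: each Han, Hiragana, or Katakana character is its own unit,
// while all other text is grouped into whitespace-separated words.
function breakableUnits(text) {
  const pattern =
    /[\p{Script=Han}\p{Script=Hiragana}\p{Script=Katakana}]|[^\s\p{Script=Han}\p{Script=Hiragana}\p{Script=Katakana}]+/gu;
  return text.match(pattern) || [];
}

const units = breakableUnits('hello 你好世界 ok');
// units: ['hello', '你', '好', '世', '界', 'ok']
```

A line counter built on these units would also need to skip the space width between adjacent ideographs, since no space is rendered there.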
Unfortunately, like I said earlier, I have noticed that this is off-by-one now and then. Though this is uncommon, it is not ideal. I’m trying to figure out what is causing the tiny discrepancies.
The test data I’m working with is randomly generated and is anywhere from 4~80 lines (and I generate 100 at a time).
Update 4: I don’t think I have any negative results anymore. The change is subtle but important: instead of getNumberOfLines(text, width, styles), you need to use getNumberOfLines(text, Math.floor(width), styles) and make sure Math.floor(width) is the width being used in the DOM as well. Browsers are inconsistent and handle decimal pixels differently. If we force the width to be an integer, then we don’t have to worry about it.
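A minimal sketch of that idea: derive one integer width and use it both for the calculation and for the element’s style, so the two can never disagree. The helper name sharedIntegerWidth and the 312.75 input are made up for illustration.

```javascript
// Hypothetical helper: floors a raw width once and returns both the number
// to pass to the line calculation and the CSS value to apply in the DOM.
function sharedIntegerWidth(rawWidth) {
  const width = Math.floor(rawWidth);
  return { width, cssWidth: `${width}px` };
}

const { width, cssWidth } = sharedIntegerWidth(312.75);
// `width` would be passed to getNumberOfLines; `cssWidth` would be set on the element.
```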
Comments
Comment posted by user120242
I don’t think I’ve ever seen a decent implementation that doesn’t use a hidden DOM element. Even those are usually still “good best guess” and not perfect. Do share if anyone does find one though.
Comment posted by David
@user120242 me either. I’m currently fiddling around with my own width calculator. Will report results.
Comment posted by David
@user120242 I edited with an update. While technically it is on the DOM, I must say… the SVG method is extremely performant. Don’t even notice a blip and I’m dealing with a large data set.
Comment posted by Kaiido
What about Z̷̧̢̩̫̟͖̟͇͙̫̟͚̦̓͌̍̐̌̊̓ä̴̭̼̹̫͎͕̲͙͈͊̌̈̕̕͜ͅl̸̻̹̦̬͕͍͉͗̓̌͐̄̃̎͂̈̄̚͘͝͠g̴̹̽̆͌͋͗̏̌̀͆̆̕ŏ̸̱͎͕̥̹͔̱̺̗̽̅̂̀̆͐̀̚͜ͅ?
Comment posted by David
@Kaiido I don’t think anything would handle that overflow well – testing on Chrome, it doesn’t acknowledge that text or accommodate its height in any way.
Comment posted by David
If you check out my “Update” and “Big Update 3”, you’ll notice that I actually link to the algorithm you mention. I also got some work started on a getNumberOfLines function. I’m not sure if my function works in all cases (but lately I’ve seen it working consistently for English). More testing has to be done with different style properties passed in. Further, my method doesn’t handle Chinese well at all (though I don’t know the Chinese rules on overflowing). Also, this is a hail mary, but my solution doesn’t handle text where you might want some words bolded (i.e. different styles).
Comment posted by David
As for your algorithm, you should check mine out. Yours has a bug where any overflow of the width will constitute a new line, but this isn’t actually true. Sometimes, text will overflow, but the browser won’t push it onto a new line.
Comment posted by harish kumar
I understand your concern here. Although this algorithm is not perfect, it can get you close to the correct value. I’ll try to figure out something else.
Comment posted by David
What styling information did you provide? I’ll check it out later.
Comment posted by harish kumar
Found your algorithm working far better than above one.