Ideal method to truncate a string with ellipsis
I'm sure all of us have seen ellipses on Facebook statuses (or elsewhere), clicked "Show more", and found only another 2 characters or so. I'd guess this is down to lazy programming, because surely there is an ideal method.
Mine counts slim characters [iIl1] as "half characters", but this doesn't get around ellipses looking silly when they hide barely any characters.
Is there an ideal method? Here is mine:
/**
* Return a string with a maximum length of <code>length</code> characters.
* If there are more than <code>length</code> characters, then string ends with an ellipsis ("...").
*
* @param text
* @param length
* @return
*/
public static String ellipsis(final String text, int length)
{
// The letters [iIl1] are slim enough to only count as half a character.
length += Math.ceil(text.replaceAll("[^iIl1]", "").length() / 2.0d);
if (text.length() > length)
{
return text.substring(0, length - 3) + "...";
}
return text;
}
Language doesn't really matter, but tagged as Java because that's what I'm mostly interested in seeing.
I like the idea of letting "thin" characters count as half a character. Simple and a good approximation.
The main issue with most ellipsizing, however, is (imho) that it chops off words in the middle. Here is a solution that takes word boundaries into account (but does not dive into pixel math and the Swing API).
private final static String NON_THIN = "[^iIl1\\.,']";
private static int textWidth(String str) {
return (int) (str.length() - str.replaceAll(NON_THIN, "").length() / 2);
}
public static String ellipsize(String text, int max) {
if (textWidth(text) <= max)
return text;
// Start by chopping off at the word before max
// This is an over-approximation due to thin-characters...
int end = text.lastIndexOf(' ', max - 3);
// Just one long word. Chop it off.
if (end == -1)
return text.substring(0, max-3) + "...";
// Step forward as long as textWidth allows.
int newEnd = end;
do {
end = newEnd;
newEnd = text.indexOf(' ', end + 1);
// No more spaces.
if (newEnd == -1)
newEnd = text.length();
} while (textWidth(text.substring(0, newEnd) + "...") < max);
return text.substring(0, end) + "...";
}
A test of the algorithm looks like this:
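One way to exercise it, as a minimal sketch: the two methods above are copied unchanged into a hypothetical class named EllipsizeDemo, and the sample strings are made up for illustration.

```java
final class EllipsizeDemo {
    private static final String NON_THIN = "[^iIl1\\.,']";

    // Thin characters count as half a character each (integer division).
    static int textWidth(String str) {
        return str.length() - str.replaceAll(NON_THIN, "").length() / 2;
    }

    static String ellipsize(String text, int max) {
        if (textWidth(text) <= max)
            return text;
        // Start by chopping off at the word before max.
        int end = text.lastIndexOf(' ', max - 3);
        // Just one long word. Chop it off.
        if (end == -1)
            return text.substring(0, max - 3) + "...";
        // Step forward as long as textWidth allows.
        int newEnd = end;
        do {
            end = newEnd;
            newEnd = text.indexOf(' ', end + 1);
            // No more spaces.
            if (newEnd == -1)
                newEnd = text.length();
        } while (textWidth(text.substring(0, newEnd) + "...") < max);
        return text.substring(0, end) + "...";
    }

    public static void main(String[] args) {
        // Cut at a word boundary: prints "The quick brown..."
        System.out.println(ellipsize("The quick brown fox jumps over the lazy dog", 20));
        // A single long word is cut mid-word: prints "Superca..."
        System.out.println(ellipsize("Supercalifragilisticexpialidocious", 10));
        // Short text is returned as-is: prints "Short text"
        System.out.println(ellipsize("Short text", 20));
    }
}
```

Note that max needs to be at least 3 here, otherwise the substring call in the one-long-word branch gets a negative index.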
I'm shocked no one mentioned Commons Lang StringUtils#abbreviate().
Update: yes, it doesn't take the slim characters into account, but I don't agree with doing that: everyone has different screens and font setups, and a large portion of the people who land on this page are probably looking for a maintained library like the above.
It seems like you might get more accurate geometry from the Java graphics context's FontMetrics.
Addendum: In approaching this problem, it may help to distinguish between the model and view. The model is a String, a finite sequence of UTF-16 code units, while the view is a series of glyphs, rendered in some font on some device.
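To make the model side concrete: because String indexes count UTF-16 code units, a plain substring can cut a character outside the Basic Multilingual Plane in half. A minimal sketch, assuming a hypothetical helper class CodePointEllipsis that counts code points instead, using String.codePointCount and String.offsetByCodePoints:

```java
final class CodePointEllipsis {
    /** Truncates to at most maxCodePoints code points, ellipsis included. */
    static String ellipsis(String text, int maxCodePoints) {
        if (maxCodePoints < 3) {
            throw new IllegalArgumentException("maxCodePoints must be at least 3");
        }
        int count = text.codePointCount(0, text.length());
        if (count <= maxCodePoints) {
            return text;
        }
        // Translate "number of code points to keep" into a char index,
        // so a surrogate pair is never split.
        int end = text.offsetByCodePoints(0, maxCodePoints - 3);
        return text.substring(0, end) + "...";
    }

    public static void main(String[] args) {
        // U+1F600 is stored as the surrogate pair \uD83D\uDE00 (two chars).
        String text = "ab\uD83D\uDE00cdefgh";
        System.out.println(text.length());                          // 10 code units
        System.out.println(text.codePointCount(0, text.length()));  // 9 code points
        System.out.println(ellipsis(text, 6).length());             // 7: "ab" + pair + "..."
    }
}
```

This still says nothing about combining marks or how wide each glyph is drawn, which is where the view comes in.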
In the particular case of Java, one can use SwingUtilities.layoutCompoundLabel() to effect the translation. The example below intercepts the layout call in BasicLabelUI to demonstrate the effect. It may be possible to use the utility method in other contexts, but the appropriate FontMetrics would have to be determined empirically.
import java.awt.Color;
import java.awt.EventQueue;
import java.awt.Font;
import java.awt.FontMetrics;
import java.awt.GridLayout;
import java.awt.Rectangle;
import java.awt.event.ComponentAdapter;
import java.awt.event.ComponentEvent;
import javax.swing.BorderFactory;
import javax.swing.Icon;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JPanel;
import javax.swing.border.EmptyBorder;
import javax.swing.border.LineBorder;
import javax.swing.plaf.basic.BasicLabelUI;
/** @see http://stackoverflow.com/questions/3597550 */
public class LayoutTest extends JPanel {
private static final String text =
"A damsel with a dulcimer in a vision once I saw.";
private final JLabel sizeLabel = new JLabel();
private final JLabel textLabel = new JLabel(text);
private final MyLabelUI myUI = new MyLabelUI();
public LayoutTest() {
super(new GridLayout(0, 1));
this.setBorder(BorderFactory.createCompoundBorder(
new LineBorder(Color.blue), new EmptyBorder(5, 5, 5, 5)));
textLabel.setUI(myUI);
textLabel.setFont(new Font("Serif", Font.ITALIC, 24));
this.add(sizeLabel);
this.add(textLabel);
this.addComponentListener(new ComponentAdapter() {
@Override
public void componentResized(ComponentEvent e) {
sizeLabel.setText(
"Before: " + myUI.before + " after: " + myUI.after);
}
});
}
private static class MyLabelUI extends BasicLabelUI {
int before, after;
@Override
protected String layoutCL(
JLabel label, FontMetrics fontMetrics, String text, Icon icon,
Rectangle viewR, Rectangle iconR, Rectangle textR) {
before = text.length();
String s = super.layoutCL(
label, fontMetrics, text, icon, viewR, iconR, textR);
after = s.length();
System.out.println(s);
return s;
}
}
private void display() {
JFrame f = new JFrame("LayoutTest");
f.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
f.add(this);
f.pack();
f.setLocationRelativeTo(null);
f.setVisible(true);
}
public static void main(String[] args) {
EventQueue.invokeLater(new Runnable() {
@Override
public void run() {
new LayoutTest().display();
}
});
}
}
If you're talking about a web site, i.e. outputting HTML/JS/CSS, you can throw away all these solutions because there is a pure CSS solution.
text-overflow:ellipsis;
It's not quite as simple as just adding that style to your CSS, because it interacts with other CSS; e.g. it requires that the element has overflow:hidden; and if you want your text on a single line, white-space:nowrap; is good too.
I have a stylesheet that looks like this:
.myelement {
word-wrap:normal;
white-space:nowrap;
overflow:hidden;
-o-text-overflow:ellipsis;
text-overflow:ellipsis;
width: 120px;
}
You can even have a "read more" button that simply runs a JavaScript function to change the styles, and bingo, the box will re-size and the full text will be visible. (In my case though, I tend to use the HTML title attribute for the full text, unless it's likely to get very long.)
Hope that helps. It's a much simpler solution than trying to calculate the text size and truncate it, and all that. (Of course, if you're writing a non-web-based app, you may still need to do that.)
There is one down-side to this solution: Firefox doesn't support the ellipsis style. Annoying, but I don't think critical: it does still truncate the text correctly, as that is dealt with by overflow:hidden, it just doesn't display the ellipsis. It does work in all the other browsers (including IE, all the way back to IE5.5!), so it's a bit annoying that Firefox doesn't do it yet. Hopefully a new version of Firefox will solve this issue soon.
[EDIT]
People are still voting on this answer, so I should edit it to note that Firefox does now support the ellipsis style. The feature was added in Firefox 7. If you're using an earlier version (FF3.6 and FF4 still have some users) then you're out of luck, but most FF users are now okay. There's a lot more detail about this here: text-overflow:ellipsis in Firefox 4? (and FF5)
For me this would be ideal -
public static String ellipsis(final String text, int length)
{
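// Text that already fits is returned unchanged; this also keeps substring() in range.
if (text.length() <= length)
{
return text;
}
return text.substring(0, length - 3) + "...";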
}
I would not worry about the size of every character unless I really know where and in what font it is going to be displayed. Many fonts are fixed-width fonts where every character has the same width.
Even if it's a variable-width font, and you count 'i' and 'l' as half the width, then why not count 'w' and 'm' as double the width? A mix of such characters in a string will generally average out the effect of their size, and I would prefer ignoring such details. Choosing the value of 'length' wisely would matter the most.
Using Guava's com.google.common.base.Ascii.truncate(CharSequence, int, String) method:
Ascii.truncate("foobar", 7, "..."); // returns "foobar"
Ascii.truncate("foobar", 5, "..."); // returns "fo..."
How about this (to get a string of 50 chars):
text.replaceAll("(?<=^.{47}).*$", "...");
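One thing to watch with that expression: as far as I can tell it also rewrites strings of 47 to 50 characters, which would have fit anyway, and `.` does not match line breaks by default. A minimal sketch with a length guard and the (?s) flag added, both of which are additions to the one-liner above, in a hypothetical class RegexEllipsis:

```java
final class RegexEllipsis {
    /** Returns text unchanged if it fits in 50 chars, else 47 chars plus "...". */
    static String truncate50(String text) {
        if (text.length() <= 50) {
            return text; // nothing to hide, so no ellipsis
        }
        // (?s) lets '.' match line breaks; the lookbehind anchors the match
        // right after the first 47 chars, and ".*$" swallows the rest.
        return text.replaceAll("(?s)(?<=^.{47}).*$", "...");
    }

    public static void main(String[] args) {
        String sixty = new String(new char[60]).replace('\0', 'a');
        System.out.println(truncate50(sixty).length());   // 50
        System.out.println(truncate50("short"));          // short
    }
}
```

For a fixed limit like this, a plain length check with substring does the same job and is probably easier to read.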
public static String getTruncated(String str, int maxSize){
int limit = maxSize - 3;
return (str.length() > maxSize) ? str.substring(0, limit) + "..." : str;
}
If you're worried about the ellipsis only hiding a very small number of characters, why not just check for that condition?
public static String ellipsis(final String text, int length)
{
// The letters [iIl1] are slim enough to only count as half a character.
length += Math.ceil(text.replaceAll("[^iIl1]", "").length() / 2.0d);
if (text.length() > length + 20)
{
return text.substring(0, length - 3) + "...";
}
return text;
}
I'd go with something similar to the standard model that you have. I wouldn't bother with the character widths thing; as @Gopi said, it is probably going to all balance out in the end. What I'd do that is new is have another parameter called something like "minNumberOfhiddenCharacters" (maybe a bit less verbose). Then when doing the ellipsis check I'd do something like:
if (text.length() > length+minNumberOfhiddenCharacters)
{
return text.substring(0, length - 3) + "...";
}
What this will mean is that if your text length is 35, your "length" is 30 and your min number of characters to hide is 10, then you would get your string in full. If your min number of characters to hide was 3, the check (35 > 33) would pass and you would get the first 27 characters followed by the ellipsis.
The main thing to be aware of is that I've subverted the meaning of "length" so that it is no longer a maximum length. The length of the outputted string can now be anything from 30 characters (when the text length is >40) to 40 characters (when the text length is 40 characters long). Effectively our max length becomes length+minNumberOfhiddenCharacters. The string could of course be shorter than 30 characters when the original string is less than 30 but this is a boring case that we should ignore.
If you want length to be a hard and fast maximum then you'd want something more like:
if (text.length() > length)
{
if (text.length() - length < minNumberOfhiddenCharacters-3)
{
return text.substring(0, text.length() - minNumberOfhiddenCharacters) + "...";
}
else
{
return text.substring(0, length - 3) + "...";
}
}
So in this example if text.length() is 37, length is 30 and minNumberOfhiddenCharacters = 10 then we'll go into the second branch of the inner if and get 27 characters + ... to make 30. This is actually the same as if we'd gone into the first branch (which is a sign we have our boundary conditions right). If the text length was 36 we'd get 26 characters + the ellipsis, giving us 29 characters with 10 hidden.
I was debating whether rearranging some of the comparison logic would make it more intuitive but in the end decided to leave it as it is. You might find that text.length() - minNumberOfhiddenCharacters < length-3 makes it more obvious what you are doing though.
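Put together as a complete method, the hard-maximum variant might look like the sketch below. The class name MinHiddenEllipsis and the parameter name minHidden are placeholders; the Math.min form is just the rearranged comparison from the previous paragraph, and the guard against a negative index is an addition.

```java
final class MinHiddenEllipsis {
    /**
     * Returns at most length chars. When an ellipsis is needed, it hides
     * at least minHidden chars of the original text.
     */
    static String ellipsis(String text, int length, int minHidden) {
        if (text.length() <= length) {
            return text;
        }
        // Keep whichever prefix is shorter: the one that fits the maximum,
        // or the one that hides at least minHidden chars.
        int keep = Math.min(length - 3, text.length() - minHidden);
        // Guard (assumption): never pass a negative index to substring().
        return text.substring(0, Math.max(keep, 0)) + "...";
    }

    public static void main(String[] args) {
        String text37 = new String(new char[37]).replace('\0', 'x');
        String text36 = new String(new char[36]).replace('\0', 'x');
        System.out.println(ellipsis(text37, 30, 10).length()); // 30 (27 chars + "...")
        System.out.println(ellipsis(text36, 30, 10).length()); // 29 (26 chars + "...")
    }
}
```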
In my eyes, you can't get good results without pixel math.
Thus, Java is probably the wrong end to fix this problem when you are in a web application context (like Facebook).
I'd go for JavaScript. Since JavaScript is not my primary field of interest, I can't really judge if this is a good solution, but it might give you a pointer.
Most of these solutions don't take font metrics into account. Here is a very simple but working solution for Java Swing that I have used for years now.
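// Needs java.awt.FontMetrics, java.awt.Graphics2D and java.awt.geom.Rectangle2D.
private String ellipsisText(String text, FontMetrics metrics, Graphics2D g2, int targetWidth) {
String shortText = text;
int activeIndex = text.length() - 1;
Rectangle2D textBounds = metrics.getStringBounds(shortText, g2);
// Stop at an empty prefix so substring() never sees a negative index.
while (textBounds.getWidth() > targetWidth && activeIndex >= 0) {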
shortText = text.substring(0, activeIndex--);
textBounds = metrics.getStringBounds(shortText + "...", g2);
}
return activeIndex != text.length() - 1 ? shortText + "..." : text;
}
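The loop above measures the string once per removed character. A sketch of the same idea with a binary search follows; it assumes that a longer prefix is never narrower than a shorter one, and it uses a hypothetical WidthMeasurer interface as a stand-in for a real measurer such as FontMetrics.stringWidth, so that it runs without a graphics context.

```java
final class PixelEllipsis {
    /** Hypothetical stand-in for something like FontMetrics.stringWidth(String). */
    interface WidthMeasurer {
        int widthOf(String s);
    }

    static String fit(String text, int targetWidth, WidthMeasurer measurer) {
        if (measurer.widthOf(text) <= targetWidth) {
            return text;
        }
        final String dots = "...";
        // Binary search for the longest prefix that fits together with the dots.
        int lo = 0;                 // prefix length assumed to fit (dots only)
        int hi = text.length() - 1; // the full text is known not to fit
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1;
            if (measurer.widthOf(text.substring(0, mid) + dots) <= targetWidth) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return text.substring(0, lo) + dots;
    }

    public static void main(String[] args) {
        // Fake metrics for the demo: every character is 10 pixels wide.
        WidthMeasurer tenPerChar = s -> s.length() * 10;
        System.out.println(fit("Hello, world!", 100, tenPerChar)); // Hello, ...
        System.out.println(fit("Hello, world!", 130, tenPerChar)); // Hello, world!
    }
}
```

In a Swing component the measurer could be something like metrics::stringWidth, with metrics obtained from the component's font.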
For simple cases, I have used String.format for this.
Here I abbreviate to at most 10 chars and add an ellipsis:
String abbreviate(String longString) {
return String.format("%.10s...", longString);
}
A little-known fact: the "precision" number in the format pattern is used for truncation in strings.
Add your own length check, of course, if you want to make the ellipsis conditional. (I was shortening a JWT for logging, so I know it's going to be longer.)
As a bonus, if the String is already shorter than the precision, there is no padding; it is simply left as is.
> System.out.println(abbreviate("This is a very long string"));
> System.out.println(abbreviate("Shorty"));
This is a ...
Shorty...
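With the length check added, it could look like the sketch below. The class name FormatEllipsis and the max parameter are made up for illustration, and here max is the total length including the ellipsis, so 13 corresponds to the "%.10s..." pattern above.

```java
final class FormatEllipsis {
    /** Returns text unchanged if it fits in max chars, else max - 3 chars plus "...". */
    static String abbreviate(String text, int max) {
        if (max < 3) {
            throw new IllegalArgumentException("max must be at least 3");
        }
        if (text.length() <= max) {
            return text; // short strings get no ellipsis
        }
        // The precision in "%.Ns" caps the argument at N characters.
        return String.format("%." + (max - 3) + "s...", text);
    }

    public static void main(String[] args) {
        System.out.println(abbreviate("This is a very long string", 13)); // This is a ...
        System.out.println(abbreviate("Shorty", 13));                     // Shorty
    }
}
```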
In PHP, you can also simply implement it like this:
mb_strimwidth($string, 0, 120, '...')
Thanks.