1 /*
2 * Inkscape::Text::Layout - text layout engine
3 *
4 * Authors:
5 * Richard Hughes <cyreve@users.sf.net>
6 *
7 * Copyright (C) 2005 Richard Hughes
8 *
9 * Released under GNU GPL, read the file 'COPYING' for more information
10 */
11 #ifndef __LAYOUT_TNG_H__
12 #define __LAYOUT_TNG_H__
14 #include "libnr/nr-rect.h"
15 #include "libnr/nr-matrix.h"
16 #include "libnr/nr-matrix-ops.h"
17 #include "libnr/nr-rotate-ops.h"
18 #include <glibmm/ustring.h>
19 #include <pango/pango-break.h>
20 #include <vector>
22 class SPStyle;
23 class Shape;
24 class NRArenaGroup;
25 class SPPrintContext;
26 class SVGLength;
27 class Path;
28 class SPCurve;
29 class font_instance;
31 namespace Inkscape {
32 namespace Text {
34 /** \brief Generates the layout for either wrapped or non-wrapped text and stores the result
36 Use this class for all your text output needs. It takes text with formatting
37 markup as input and turns that into the glyphs and their necessary positions.
38 It stores the glyphs internally, but maintains enough information to both
39 retrieve your own rendering information if you wish and to perform visual
40 text editing where the output refers back to where it came from.
42 Usage:
43 -# Construct
44 -# Set the text using appendText() and appendControlCode()
45 -# If you want text wrapping, call appendWrapShape() a few times
46 -# Call calculateFlow()
47 -# You can go several directions from here, but the most interesting
48 things start with creating a Layout::iterator with begin() or end().
50 Terminology, in descending order of size:
51 - Flow: Not often used, but when it is it means all the text
52 - Shape: A Shape object which is used to represent one of the regions inside
53 which to flow the text. Can overlap with...
54 - Paragraph: Err...A paragraph. Contains one or more...
55 - Line: An entire horizontal line with a common baseline. Contains one or
56 more...
57 - Chunk: You only get more than one of these when a shape is sufficiently
58 complex that the text has to flow either side of some obstruction in
59 the middle. A chunk is the base unit for wrapping. Contains one or more...
60 - Span: A convenient subset of a chunk with the same font, style,
61 directionality, block progression and input stream. Fill and outline
62 need not be constant because that's a later rendering stage.
63 - This is where it gets weird because a span will contain one or more
64 elements of both of the following, which can overlap with each other in
65 any way:
- Character: a single Unicode codepoint from an input stream. Many Arabic
characters are rendered using multiple glyphs.
68 - Glyph: a rendering primitive for font engines. A ligature glyph will
69 represent multiple characters.
71 Other terminology:
72 - Input stream: An object representing a single call to appendText() or
73 appendControlCode().
74 - Control code: Metadata in the text stream to signify items that occupy
75 real space (unlike style changes) but don't belong in the text string.
76 Paragraph breaks are in this category. See Layout::TextControlCode.
77 - SVG1.1: The W3C Recommendation "Scalable Vector Graphics (SVG) 1.1"
78 http://www.w3.org/TR/SVG11/
79 - 'left', 'down', etc: These terms are generally used to mean what they
80 mean in left-to-right, top-to-bottom text but rotated or reflected for
81 the current directionality. Thus, the 'width' of a ttb line is actually
82 its height, and the (internally stored) y coordinate of a glyph is
83 actually its x coordinate. Confusing to the reader but much simpler in
84 the code. All public methods use real x and y.
86 Comments:
87 - There's a strong emphasis on international support in this class, but
88 that's primarily because once you can display all the insane things
89 required by various languages, simple things like styling text are
90 almost trivial.
91 - There are a few places (appendText() is one) where pointers are held to
92 caller-owned objects and used for quite a long time. This is messy but
93 is safe for our usage scenario and in many cases the cost of copying the
94 objects is quite high.
95 - "Why isn't foo here?": Ask yourself if it's possible to implement foo
externally using iterators. That alone does not necessarily mean it
doesn't belong as a member, though.
98 - I've used floats rather than doubles to store relative distances in some
99 places (internal only) where it would save significant amounts of memory.
The SVG spec allows you to do this as long as intermediate calculations
are done in double precision. Very long lines might not finish precisely where
102 you want, but that's to be expected with any typesetting. Also,
103 SVGLength only uses floats.
104 - If you look at the six arrays for holding the output data you'll realise
105 that there's no O(1) way to drill down from a paragraph to find its
106 starting glyph. This was a conscious decision to reduce complexity and
107 to save memory. Drilling down isn't actually that slow because a binary
chop will work nicely. Add to this the realisation that most drill-downs
happen in response to user actions, so you only need to be faster than
the user, and I think the design makes sense.
111 - There are a massive number of functions acting on Layout::iterator. A
112 large number are trivial and will be inline, but is it really necessary
113 to have all these, especially when some can be implemented by the caller
114 using the others?
115 - The separation of methods between Layout and Layout::iterator is a
116 bit arbitrary, because many methods could go in either. I've used the STL
117 model where the iterator itself can only move around; the base class is
118 required to do anything interesting.
119 - I use Pango internally, not Pangomm. The reason for this is lots of
120 Pangomm methods take Glib::ustrings as input and then output byte offsets
121 within the strings. There's simply no way to use byte offsets with
122 ustrings without some very entertaining reinterpret_cast<>s. The Pangomm
123 docs seem to be lacking quite a lot of things mentioned in the Pango
124 docs, too.
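
As a rough illustration of the binary chop mentioned above, the following
standalone C++11 sketch finds the first line of a paragraph in a flat array
whose elements only store the index of their parent. MiniLine and
firstLineOfParagraph() are hypothetical names invented for this example; they
are not members of Layout and the real drill-down code may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an output record that only stores its parent.
struct MiniLine {
    unsigned in_paragraph;
};

// Returns the index of the first line whose paragraph is not before
// paragraph_index, or lines.size() if there is no such line.
inline std::size_t firstLineOfParagraph(std::vector<MiniLine> const &lines,
                                        unsigned paragraph_index)
{
    // The chop only works if parent indices never decrease along the array.
    // Debug-only check; compiled out when NDEBUG is defined.
    assert(std::is_sorted(lines.begin(), lines.end(),
                          [](MiniLine const &a, MiniLine const &b) {
                              return a.in_paragraph < b.in_paragraph;
                          }));
    std::vector<MiniLine>::const_iterator it = std::lower_bound(
        lines.begin(), lines.end(), paragraph_index,
        [](MiniLine const &line, unsigned index) {
            return line.in_paragraph < index;
        });
    return static_cast<std::size_t>(it - lines.begin());
}
```

The comparison takes an element and an index and returns "element's parent <
index", which is the shape std::lower_bound expects, so the lookup costs
O(log n) without storing any child indices in the parent.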
125 */
126 class Layout {
127 public:
128 class iterator;
129 friend class iterator;
130 class Calculator;
131 friend class Calculator;
132 class ScanlineMaker;
133 class InfiniteScanlineMaker;
134 class ShapeScanlineMaker;
136 Layout();
137 virtual ~Layout();
139 /** Used to specify any particular text direction required. Used for
140 both the 'direction' and 'block-progression' CSS attributes. */
141 enum Direction {LEFT_TO_RIGHT, RIGHT_TO_LEFT, TOP_TO_BOTTOM, BOTTOM_TO_TOP};
143 /** Display alignment for shapes. See appendWrapShape(). */
144 enum DisplayAlign {DISPLAY_ALIGN_BEFORE, DISPLAY_ALIGN_CENTER, DISPLAY_ALIGN_AFTER};
146 /** The optional attributes which can be applied to a SVG text or
147 related tag. See appendText(). See SVG1.1 section 10.4 for the
148 definitions of all these members. See sp_svg_length_list_read() for
149 the standard way to make these vectors. It is the responsibility of
150 the caller to deal with the inheritance of these values using its
151 knowledge of the parse tree. */
152 struct OptionalTextTagAttrs {
153 std::vector<SVGLength> x;
154 std::vector<SVGLength> y;
155 std::vector<SVGLength> dx;
156 std::vector<SVGLength> dy;
157 std::vector<SVGLength> rotate;
158 };
160 /** Control codes which can be embedded in the text to be flowed. See
161 appendControlCode(). */
162 enum TextControlCode {
163 PARAGRAPH_BREAK, /// forces the flow to move on to the next line
164 SHAPE_BREAK, /// forces the flow to ignore the remainder of the current shape (from #flow_inside_shapes) and continue at the top of the one after.
165 ARBITRARY_GAP /// inserts an arbitrarily-sized hole in the flow in line with the current text.
166 };
168 /** For expressing paragraph alignment. These values are rotated in the
169 case of vertical text, but are not dependent on whether the paragraph is
170 rtl or ltr, thus LEFT is always either left or top. */
171 enum Alignment {LEFT, CENTER, RIGHT, FULL};
173 /** The CSS spec allows line-height:normal to be whatever the user agent
174 thinks will look good. This is our value, as a multiple of font-size. */
175 static const double LINE_HEIGHT_NORMAL;
177 // ************************** describing the stuff to flow *************************
179 /** \name Input
180 Methods for describing the text you want to flow, its style, and the
181 shapes to flow in to.
182 */
183 //@{
185 /** Empties everything stored in this class and resets it to its
186 original state, like when it was created. All iterators on this
187 object will be invalidated (but can be revalidated using
validateIterator()). */
189 void clear();
191 /** Queries whether any calls have been made to appendText() or
192 appendControlCode() since the object was last cleared. */
193 bool inputExists() const
194 {return !_input_stream.empty();}
/** Adds a new piece of text to the end of the current list of text to
197 be processed. This method can only add text of a consistent style.
198 To add lots of different styles, call it lots of times.
199 \param text The text. \b Note: only a \em pointer is stored. Do not
200 mess with the text until after you have called
201 calculateFlow().
202 \param style The font style. Layout will hold a reference to this
203 object for the duration of its ownership, ie until you
204 call clear() or the class is destroyed. Must not be NULL.
205 \param source_cookie This pointer is treated as opaque by Layout
206 but will be passed through the flowing process intact so
207 that callers can use it to refer to the original object
208 that generated a particular glyph. See Layout::iterator.
209 Implementation detail: currently all callers put an
210 SPString in here.
211 \param optional_attributes A structure containing additional options
212 for this text. See OptionalTextTagAttrs. The values are
213 copied to internal storage before this method returns.
214 \param optional_attributes_offset It is convenient for callers to be
215 able to use the same \a optional_attributes structure for
216 several sequential text fields, in which case the vectors
217 will need to be offset. This parameter causes the <i>n</i>th
218 element of all the vectors to be read as if it were the
219 first.
220 \param text_begin Used for selecting only a substring of \a text
221 to process.
222 \param text_end Used for selecting only a substring of \a text
223 to process.
224 */
225 void appendText(Glib::ustring const &text, SPStyle *style, void *source_cookie, OptionalTextTagAttrs const *optional_attributes, unsigned optional_attributes_offset, Glib::ustring::const_iterator text_begin, Glib::ustring::const_iterator text_end);
226 inline void appendText(Glib::ustring const &text, SPStyle *style, void *source_cookie, OptionalTextTagAttrs const *optional_attributes = NULL, unsigned optional_attributes_offset = 0)
227 {appendText(text, style, source_cookie, optional_attributes, optional_attributes_offset, text.begin(), text.end());}
229 /** Control codes are metadata in the text stream to signify items
230 that occupy real space (unlike style changes) but don't belong in the
231 text string. See TextControlCode for the types available.
233 A control code \em cannot be the first item in the input stream. Use
234 appendText() with an empty string to set up the paragraph properties.
\param code A member of the TextControlCode enumeration.
236 \param width The width in pixels that this item occupies.
237 \param ascent The number of pixels above the text baseline that this
238 control code occupies.
239 \param descent The number of pixels below the text baseline that this
240 control code occupies.
241 \param source_cookie This pointer is treated as opaque by Layout
242 but will be passed through the flowing process intact so
243 that callers can use it to refer to the original object
244 that generated a particular area. See Layout::iterator.
245 Implementation detail: currently all callers put an
246 SPObject in here.
Note that for some control codes (eg tab) the values of the \a width,
\a ascent and \a descent are implied by the surrounding text (and
249 in the case of tabs, the values set in tab_stops) so the values you pass
250 here are ignored.
251 */
252 void appendControlCode(TextControlCode code, void *source_cookie, double width = 0.0, double ascent = 0.0, double descent = 0.0);
254 /** Stores another shape inside which to flow the text. If this method
255 is never called then no automatic wrapping is done and lines will
256 continue to infinity if necessary. Text can be flowed inside multiple
257 shapes in sequence, like with frames in a DTP package. If the text flows
258 past the end of the last shape all remaining text is ignored.
260 \param shape The Shape to use next in the flow. The storage for this
261 is managed by the caller, and need only be valid for
262 the duration of the call to calculateFlow().
263 \param display_align The vertical alignment of the text within this
264 shape. See XSL1.0 section 7.13.4. The behaviour of
265 settings other than DISPLAY_ALIGN_BEFORE when using
266 non-rectangular shapes is undefined.
267 */
268 void appendWrapShape(Shape const *shape, DisplayAlign display_align = DISPLAY_ALIGN_BEFORE);
270 //@}
272 // ************************** doing the actual flowing *************************
274 /** \name Processing
275 The method to do the actual work of converting text into glyphs.
276 */
277 //@{
279 /** Takes all the stuff you set with the members above here and creates
280 a load of glyphs for use with the members below here. All iterators on
this object will be invalidated (but can be fixed with validateIterator()).
282 The implementation just creates a new Layout::Calculator and calls its
283 Calculator::Calculate() method, so if you want more details on the
284 internals, go there.
285 \return false on failure.
286 */
287 bool calculateFlow();
289 //@}
291 // ************************** operating on the output glyphs *************************
293 /** \name Output
294 Methods for reading and interpreting the output glyphs. See also
295 Layout::iterator.
296 */
297 //@{
299 /** Returns true if there are some glyphs in this object, ie whether
calculateFlow() has been called on a non-empty input since the object was
301 created or the last call to clear(). */
302 inline bool outputExists() const
303 {return !_characters.empty();}
305 /** Adds all the output glyphs to \a in_arena using the given \a paintbox.
306 \param in_arena The arena to add the glyphs group to
307 \param paintbox The current rendering tile
308 */
309 void show(NRArenaGroup *in_arena, NRRect const *paintbox) const;
311 /** Calculates the smallest rectangle completely enclosing all the
312 glyphs.
313 \param bounding_box Where to store the box
314 \param transform The transform to be applied to the entire object
315 prior to calculating its bounds.
316 */
317 void getBoundingBox(NRRect *bounding_box, NR::Matrix const &transform, int start = -1, int length = -1) const;
319 /** Sends all the glyphs to the given print context.
320 \param ctx I have
321 \param pbox no idea
322 \param dbox what these
323 \param bbox parameters
324 \param ctm do yet
325 */
326 void print(SPPrintContext *ctx, NRRect const *pbox, NRRect const *dbox, NRRect const *bbox, NRMatrix const &ctm) const;
328 /** debug and unit test method. Creates a textual representation of the
329 contents of this object. The output is designed to be both human-readable
330 and comprehensible when diffed with a known-good dump. */
331 Glib::ustring dumpAsText() const;
333 /** Moves all the glyphs in the structure so that the baseline of all
334 the characters sits neatly along the path specified. If the text has
335 more than one line the results are undefined. The 'align' means to
336 use the SVG align method as documented in SVG1.1 section 10.13.2.
337 NB: njh has suggested that it would be cool if we could flow from
338 shape to path and back again. This is possible, so this method will be
339 removed at some point.
340 A pointer to \a path is retained by the class for use by the cursor
341 positioning functions. */
342 void fitToPathAlign(SVGLength const &startOffset, Path const &path);
344 /** Convert the specified range of characters into their bezier
345 outlines.
346 */
347 SPCurve* convertToCurves(iterator const &from_glyph, iterator const &to_glyph) const;
348 inline SPCurve* convertToCurves() const;
350 /** Apply the given transform to all the output presently stored in
this object. This only transforms the glyph positions; the glyphs
352 themselves will not be transformed. */
353 void transform(NR::Matrix const &transform);
355 //@}
357 // **********
359 /** \name Output (Iterators)
360 Methods for operating with the Layout::iterator class. The method
361 names ending with 'Index' return 0-based offsets of the number of
362 items since the beginning of the flow.
363 */
364 //@{
366 /** Returns an iterator pointing at the first glyph of the flowed output.
367 The first glyph is also the first character, line, paragraph, etc. */
368 inline iterator begin() const;
370 /** Returns an iterator pointing just past the end of the last glyph,
371 which is also just past the end of the last chunk, span, etc, etc. */
372 inline iterator end() const;
374 /** Returns an iterator pointing at the given character index. This
375 index should be related to the result from a prior call to
376 iteratorToCharIndex(). */
377 inline iterator charIndexToIterator(int char_index) const;
379 /** Returns the character index from the start of the flow represented
380 by the given iterator. This number isn't very useful, except for when
editing text it will stay valid across calls to calculateFlow() and will
382 change in predictable ways when characters are added and removed. It's
383 also useful when transitioning old code. */
384 inline int iteratorToCharIndex(iterator const &it) const;
386 /** Checks the validity of the given iterator over the current layout.
387 If it points to a position out of the bounds for this layout it will
388 be corrected to the nearest valid position. If you pass an iterator
389 belonging to a different layout it will be converted to one for this
390 layout. */
391 inline void validateIterator(iterator *it) const;
393 /** Returns an iterator pointing to the cursor position for a mouse
394 click at the given coordinates. */
395 iterator getNearestCursorPositionTo(double x, double y) const;
396 inline iterator getNearestCursorPositionTo(NR::Point &point) const;
398 /** Returns an iterator pointing to the letter whose bounding box contains
399 the given coordinates. end() if the point is not over any letter. The
400 iterator will \em not point at the specific glyph within the character. */
401 iterator getLetterAt(double x, double y) const;
402 inline iterator getLetterAt(NR::Point &point) const;
404 /** Returns an iterator pointing to the character in the output which
405 was created from the given input. If the character at the given byte
406 offset was removed (soft hyphens, for example) the next character after
407 it is returned. If no input was added with the given cookie, end() is
408 returned. If more than one input has the same cookie, the first will
409 be used regardless of the value of \a text_iterator. If
410 \a text_iterator is out of bounds, the first or last character belonging
411 to the given input will be returned accordingly. */
412 iterator sourceToIterator(void *source_cookie, Glib::ustring::const_iterator text_iterator) const;
414 /** Returns an iterator pointing to the first character in the output
415 which was created from the given source. If \a source_cookie is invalid,
416 end() is returned. If more than one input has the same cookie, the
417 first one will be used. */
418 iterator sourceToIterator(void *source_cookie) const;
420 // many functions acting on iterators, most of which are obvious
421 // also most of them don't check that \a it != end(). Be careful.
423 /** Returns the bounding box of the given glyph, and its rotation.
424 The centre of rotation is the horizontal centre of the box at the
425 text baseline. */
426 NR::Rect glyphBoundingBox(iterator const &it, double *rotation) const;
428 /** Returns the zero-based line number of the character pointed to by
429 \a it. */
430 inline unsigned lineIndex(iterator const &it) const;
432 /** Returns the zero-based number of the shape which contains the
433 character pointed to by \a it. */
434 inline unsigned shapeIndex(iterator const &it) const;
436 /** Returns true if the character at \a it is a whitespace, as defined
437 by Pango. This is not meant to be used for picking out words from the
438 output, use iterator::nextStartOfWord() and friends instead. */
439 inline bool isWhitespace(iterator const &it) const;
441 /** Returns the unicode character code of the character pointed to by
442 \a it. If \a it == end() the result is undefined. */
443 inline int characterAt(iterator const &it) const;
445 /** Discovers where the character pointed to by \a it came from, by
446 retrieving the cookie that was passed to the call to appendText() or
447 appendControlCode() which generated that output. If \a it == end()
448 then NULL is returned as the cookie. If the character was generated
449 from a call to appendText() then the optional \a text_iterator
450 parameter is set to point to the actual character, otherwise
451 \a text_iterator is unaltered. */
452 void getSourceOfCharacter(iterator const &it, void **source_cookie, Glib::ustring::iterator *text_iterator = NULL) const;
454 /** For latin text, the left side of the character, on the baseline */
455 NR::Point characterAnchorPoint(iterator const &it) const;
/** This is the value to apply to the x,y attributes of tspan role=line
458 elements, and hence it takes alignment into account. */
459 NR::Point chunkAnchorPoint(iterator const &it) const;
461 /** Returns the box extents (not ink extents) of the given character.
462 The centre of rotation is at the horizontal centre of the box on the
463 text baseline. */
464 NR::Rect characterBoundingBox(iterator const &it, double *rotation = NULL) const;
466 /** Basically uses characterBoundingBox() on all the characters from
467 \a start to \a end and returns the union of these boxes. The return value
468 is a list of zero or more quadrilaterals specified by a group of four
469 points for each, thus size() is always a multiple of four. */
470 std::vector<NR::Point> createSelectionShape(iterator const &it_start, iterator const &it_end, NR::Matrix const &transform) const;
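/* A minimal sketch of how a caller might consume a flat list of points in
which every four consecutive entries form one quadrilateral, as described
for createSelectionShape() above. QuadPoint, Quad and splitIntoQuads() are
hypothetical names for this example (NR::Point is replaced by a plain
struct so the sketch is standalone).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for NR::Point.
struct QuadPoint {
    double x, y;
};

// Four corners, in the order they appeared in the flat list.
struct Quad {
    QuadPoint corner[4];
};

// Groups a flat point list into quadrilaterals, four points at a time.
inline std::vector<Quad> splitIntoQuads(std::vector<QuadPoint> const &points)
{
    assert(points.size() % 4 == 0);   // documented invariant of the result
    std::vector<Quad> quads;
    quads.reserve(points.size() / 4);
    for (std::size_t i = 0; i + 3 < points.size(); i += 4) {
        Quad quad;
        for (std::size_t j = 0; j < 4; j++)
            quad.corner[j] = points[i + j];
        quads.push_back(quad);
    }
    return quads;
}
```

An empty list simply yields no quadrilaterals, which matches the "zero or
more" wording above.
*/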
472 /** Returns true if \a it points to a character which is a valid cursor
473 position, as defined by Pango. */
474 inline bool isCursorPosition(iterator const &it) const;
476 /** Gets the ideal cursor shape for a given iterator. The result is
477 undefined if \a it is not at a valid cursor position.
478 \param it The location in the output
479 \param position The pixel location of the centre of the 'bottom' of
480 the cursor.
481 \param height The height in pixels of the surrounding text
482 \param rotation The angle to draw from \a position. Radians, zero up,
483 increasing clockwise.
484 */
485 void queryCursorShape(iterator const &it, NR::Point *position, double *height, double *rotation) const;
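/* A hedged sketch of turning the three outputs of queryCursorShape() into
a line segment. It assumes a y-down coordinate system (as in SVG user
space), so "zero up" is the direction (0, -1) and clockwise rotation moves
the tip towards positive x. That axis convention, CursorPoint and
cursorTip() are assumptions of this example, not something this header
states.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for NR::Point.
struct CursorPoint {
    double x, y;
};

// Returns the far end of the cursor line that starts at position.
// rotation is in radians, zero pointing up, increasing clockwise.
inline CursorPoint cursorTip(CursorPoint const &position, double height,
                             double rotation)
{
    assert(height >= 0.0);
    CursorPoint tip;
    tip.x = position.x + height * std::sin(rotation);
    tip.y = position.y - height * std::cos(rotation);   // y grows downwards
    return tip;
}
```

With rotation zero the tip lies directly above the position; with a quarter
turn it lies to the right. If your canvas uses y-up coordinates the sign of
the y term would need flipping.
*/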
/** Returns true if \a it points to a character which is the start of
488 a word, as defined by Pango. */
489 inline bool isStartOfWord(iterator const &it) const;
/** Returns true if \a it points to a character which is the end of
492 a word, as defined by Pango. */
493 inline bool isEndOfWord(iterator const &it) const;
/** Returns true if \a it points to a character which is the start of
496 a sentence, as defined by Pango. */
497 inline bool isStartOfSentence(iterator const &it) const;
/** Returns true if \a it points to a character which is the end of
500 a sentence, as defined by Pango. */
501 inline bool isEndOfSentence(iterator const &it) const;
503 /** Returns the zero-based number of the paragraph containing the
504 character pointed to by \a it. */
505 inline unsigned paragraphIndex(iterator const &it) const;
507 /** Returns the actual alignment used for the paragraph containing
508 the character pointed to by \a it. This means that the CSS 'start'
509 and 'end' are correctly translated into LEFT or RIGHT according to
510 the paragraph's directionality. For vertical text, LEFT is top
511 alignment and RIGHT is bottom. */
512 inline Alignment paragraphAlignment(iterator const &it) const;
514 /** Returns kerning information which could cause the current output
515 to be exactly reproduced if the letter and word spacings were zero and
516 full justification was not used. The x and y arrays are not used, but
517 they are cleared. The dx applied to the first character in a chunk
518 will always be zero. If the region between \a from and \a to crosses
519 a line break then the results may be surprising, and are undefined.
520 Trailing zeros on the returned arrays will be trimmed. */
521 void simulateLayoutUsingKerning(iterator const &from, iterator const &to, OptionalTextTagAttrs *result) const;
523 //@}
525 /// it's useful for this to be public so that ScanlineMaker can use it
526 struct LineHeight {
527 double ascent;
528 double descent;
529 double leading;
530 inline double total() const {return ascent + descent + leading;}
531 inline void setZero() {ascent = descent = leading = 0.0;}
532 inline LineHeight& operator*=(double x) {ascent *= x; descent *= x; leading *= x; return *this;}
void max(LineHeight const &other); /// sets each of the three members to the larger of its value in this object and in other
534 };
536 /// see _enum_converter()
537 struct EnumConversionItem {
538 int input, output;
539 };
541 private:
542 /** Erases all the stuff set by the owner as input, ie #_input_stream
543 and #_input_wrap_shapes. */
544 void _clearInputObjects();
/** Erases all the stuff output by calculateFlow(). Glyphs and things. */
547 void _clearOutputObjects();
549 static const gunichar UNICODE_SOFT_HYPHEN;
551 // ******************* input flow
553 enum InputStreamItemType {TEXT_SOURCE, CONTROL_CODE};
555 class InputStreamItem {
556 public:
557 virtual ~InputStreamItem() {}
558 virtual InputStreamItemType Type() =0;
559 void *source_cookie;
560 };
562 /** Represents a text item in the input stream. See #_input_stream.
563 Most of the members are copies of the values passed to appendText(). */
564 class InputStreamTextSource : public InputStreamItem {
565 public:
566 virtual InputStreamItemType Type() {return TEXT_SOURCE;}
567 virtual ~InputStreamTextSource();
568 Glib::ustring const *text; /// owned by the caller
569 Glib::ustring::const_iterator text_begin, text_end;
int text_length; /// in characters, from text_begin to text_end only
571 SPStyle *style;
572 /** These vectors can (often will) be shorter than the text
573 in this source, but never longer. */
574 std::vector<SVGLength> x;
575 std::vector<SVGLength> y;
576 std::vector<SVGLength> dx;
577 std::vector<SVGLength> dy;
578 std::vector<SVGLength> rotate;
580 // a few functions for some of the more complicated style accesses
581 float styleComputeFontSize() const;
582 font_instance *styleGetFontInstance() const;
583 Direction styleGetBlockProgression() const;
584 Alignment styleGetAlignment(Direction para_direction, bool try_text_align) const;
585 };
587 /** Represents a control code item in the input stream. See
#_input_stream. All the members are copies of the values passed to
589 appendControlCode(). */
590 class InputStreamControlCode : public InputStreamItem {
591 public:
592 virtual InputStreamItemType Type() {return CONTROL_CODE;}
593 TextControlCode code;
594 double ascent;
595 double descent;
596 double width;
597 };
599 /** This is our internal storage for all the stuff passed to the
600 appendText() and appendControlCode() functions. */
601 std::vector<InputStreamItem*> _input_stream;
603 /** The parameters to appendText() are allowed to be a little bit
604 complex. This copies them to be the right length and starting at zero.
605 We also don't want to write five bits of identical code just with
606 different variable names. */
607 static void _copyInputVector(std::vector<SVGLength> const &input_vector, unsigned input_offset, std::vector<SVGLength> *output_vector, size_t max_length);
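/* To make the "right length and starting at zero" idea concrete, here is a
standalone sketch of such a copy. It uses plain doubles instead of
SVGLength, and copyWithOffset() is a hypothetical name; the real
_copyInputVector() may treat unset or out-of-range values differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Copies at most max_length elements of input, starting at input_offset,
// into *output so that input[input_offset] becomes (*output)[0].
inline void copyWithOffset(std::vector<double> const &input,
                           unsigned input_offset,
                           std::vector<double> *output,
                           std::size_t max_length)
{
    assert(output != nullptr);
    output->clear();
    if (input_offset >= input.size())
        return;   // the offset skips past everything: nothing to copy
    std::size_t const available = input.size() - input_offset;
    std::size_t const count = std::min(available, max_length);
    output->assign(input.begin() + input_offset,
                   input.begin() + input_offset + count);
}
```

Passing the text length as max_length keeps the result from being longer
than the text it applies to, which is the property the member vectors of
InputStreamTextSource are documented to have.
*/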
609 /** There are a few cases where we have different sets of enums meaning
610 the same thing, eg Pango font styles vs. SPStyle font styles. These need
611 converting. */
612 static int _enum_converter(int input, EnumConversionItem const *conversion_table, unsigned conversion_table_size);
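/* A small sketch of a table-driven conversion in the spirit of
_enum_converter(). ConversionItem mirrors EnumConversionItem, and
convertEnum() is a hypothetical name. Falling back to the first table
entry when nothing matches is an assumption of this example, not
documented behaviour.

```cpp
#include <cassert>

// Same layout as EnumConversionItem: one (input, output) pair per row.
struct ConversionItem {
    int input, output;
};

// Linear search for input in the table; returns the paired output.
inline int convertEnum(int input, ConversionItem const *table,
                       unsigned table_size)
{
    assert(table != nullptr && table_size > 0);
    for (unsigned i = 0; i < table_size; i++) {
        if (table[i].input == input)
            return table[i].output;
    }
    return table[0].output;   // assumed fallback: treat row 0 as the default
}
```

A linear search is adequate here because such tables typically have only a
handful of rows.
*/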
614 /** The overall block-progression of the whole flow. */
615 inline Direction _blockProgression() const
616 {return static_cast<InputStreamTextSource*>(_input_stream.front())->styleGetBlockProgression();}
/** Returns true if \a d1 and \a d2 run along perpendicular axes. For this
test LEFT_TO_RIGHT is equivalent to RIGHT_TO_LEFT, but not to TOP_TO_BOTTOM. */
619 static bool _directions_are_orthogonal(Direction d1, Direction d2);
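/* One possible way to express this test, shown as a standalone sketch.
MiniDirection copies the enumerators of Direction; isHorizontal() and
directionsAreOrthogonal() are hypothetical helpers and the real
implementation may be written differently.

```cpp
// Same enumerators as Layout::Direction, repeated so the sketch stands alone.
enum MiniDirection {LEFT_TO_RIGHT, RIGHT_TO_LEFT, TOP_TO_BOTTOM, BOTTOM_TO_TOP};

// True for the two directions that run along the horizontal axis.
inline bool isHorizontal(MiniDirection d)
{
    return d == LEFT_TO_RIGHT || d == RIGHT_TO_LEFT;
}

// Two directions are orthogonal when exactly one of them is horizontal.
inline bool directionsAreOrthogonal(MiniDirection d1, MiniDirection d2)
{
    return isHorizontal(d1) != isHorizontal(d2);
}
```

Reducing each direction to its axis first means opposite directions on the
same axis compare as parallel.
*/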
621 /** If the output is empty callers still want to be able to call
622 queryCursorShape() and get a valid answer so, while #_input_wrap_shapes
623 can still be considered valid, we need to precompute the cursor shape
624 for this case. */
625 void _calculateCursorShapeForEmpty();
627 struct CursorShape {
628 NR::Point position;
629 double height;
630 double rotation;
631 } _empty_cursor_shape;
633 // ******************* input shapes
635 struct InputWrapShape {
636 Shape const *shape; /// as passed to Layout::appendWrapShape()
637 DisplayAlign display_align; /// as passed to Layout::appendWrapShape()
638 };
639 std::vector<InputWrapShape> _input_wrap_shapes;
641 // ******************* output
643 /** as passed to fitToPathAlign() */
644 Path const *_path_fitted;
646 struct Glyph;
647 struct Character;
648 struct Span;
649 struct Chunk;
650 struct Line;
651 struct Paragraph;
653 struct Glyph {
654 int glyph;
655 unsigned in_character;
656 float x; /// relative to the start of the chunk
657 float y; /// relative to the current line's baseline
658 float rotation; /// absolute, modulo any object transforms, which we don't know about
659 float width;
660 inline Span const & span(Layout const *l) const {return l->_spans[l->_characters[in_character].in_span];}
661 inline Chunk const & chunk(Layout const *l) const {return l->_chunks[l->_spans[l->_characters[in_character].in_span].in_chunk];}
662 inline Line const & line(Layout const *l) const {return l->_lines[l->_chunks[l->_spans[l->_characters[in_character].in_span].in_chunk].in_line];}
663 };
664 struct Character {
665 unsigned in_span;
666 float x; /// relative to the start of the *span* (so we can do block-progression)
667 PangoLogAttr char_attributes;
668 int in_glyph; /// will be -1 if this character has no visual representation
669 inline Span const & span(Layout const *l) const {return l->_spans[in_span];}
670 inline Chunk const & chunk(Layout const *l) const {return l->_chunks[l->_spans[in_span].in_chunk];}
671 inline Line const & line(Layout const *l) const {return l->_lines[l->_chunks[l->_spans[in_span].in_chunk].in_line];}
672 inline Paragraph const & paragraph(Layout const *l) const {return l->_paragraphs[l->_lines[l->_chunks[l->_spans[in_span].in_chunk].in_line].in_paragraph];}
673 // to get the advance width of a character, subtract the x values if it's in the middle of a span, or use span.x_end if it's at the end
674 };
    struct Span {
        unsigned in_chunk;
        font_instance *font;
        float font_size;
        float x_start;                ///< relative to the start of the chunk
        float x_end;                  ///< relative to the start of the chunk
        LineHeight line_height;
        double baseline_shift;        ///< relative to the line's baseline
        Direction direction;          ///< See CSS3 section 3.2. Either rtl or ltr
        Direction block_progression;  ///< See CSS3 section 3.2. The direction in which lines go.
        unsigned in_input_stream_item;
        Glib::ustring::const_iterator input_stream_first_character;
        inline Chunk const & chunk(Layout const *l) const {return l->_chunks[in_chunk];}
        inline Line const & line(Layout const *l) const {return l->_lines[l->_chunks[in_chunk].in_line];}
        inline Paragraph const & paragraph(Layout const *l) const {return l->_paragraphs[l->_lines[l->_chunks[in_chunk].in_line].in_paragraph];}
    };
    struct Chunk {
        unsigned in_line;
        double left_x;
    };
    struct Line {
        unsigned in_paragraph;
        double baseline_y;
        unsigned in_shape;
    };
    struct Paragraph {
        Direction base_direction;     ///< can be overridden by child Span objects
        Alignment alignment;
    };
    std::vector<Paragraph> _paragraphs;
    std::vector<Line> _lines;
    std::vector<Chunk> _chunks;
    std::vector<Span> _spans;
    std::vector<Character> _characters;
    std::vector<Glyph> _glyphs;

    /** gets the overall matrix that transforms the given glyph from local
    space to world space. */
    void _getGlyphTransformMatrix(int glyph_index, NRMatrix *matrix) const;

    // loads of functions to drill down the object tree, all of them
    // annoyingly similar and all of them requiring predicate functors.
    // I'll be buggered if I can find a way to make it work with
    // functions or with a templated functor, so macros it is.
#define EMIT_PREDICATE(name, object_type, index_generator)                  \
    class name {                                                            \
        Layout const * const _flow;                                         \
    public:                                                                 \
        inline name(Layout const *flow) : _flow(flow) {}                    \
        inline bool operator()(object_type const &object, unsigned index)   \
            {return index_generator < index;}                               \
    }
// end of macro
    EMIT_PREDICATE(PredicateLineToSpan,        Span,      _flow->_chunks[object.in_chunk].in_line);
    EMIT_PREDICATE(PredicateLineToCharacter,   Character, _flow->_chunks[_flow->_spans[object.in_span].in_chunk].in_line);
    EMIT_PREDICATE(PredicateSpanToCharacter,   Character, object.in_span);
    EMIT_PREDICATE(PredicateSourceToCharacter, Character, _flow->_spans[object.in_span].in_input_stream_item);

    // Each of these returns the index of the first child object whose parent
    // index is not less than the argument (or the size of the child vector if
    // there is none). They rely on the child vectors being ordered by parent index.
    inline unsigned _lineToSpan(unsigned line_index) const
        {return std::lower_bound(_spans.begin(), _spans.end(), line_index, PredicateLineToSpan(this)) - _spans.begin();}
    inline unsigned _lineToCharacter(unsigned line_index) const
        {return std::lower_bound(_characters.begin(), _characters.end(), line_index, PredicateLineToCharacter(this)) - _characters.begin();}
    inline unsigned _spanToCharacter(unsigned span_index) const
        {return std::lower_bound(_characters.begin(), _characters.end(), span_index, PredicateSpanToCharacter(this)) - _characters.begin();}
    inline unsigned _sourceToCharacter(unsigned source_index) const
        {return std::lower_bound(_characters.begin(), _characters.end(), source_index, PredicateSourceToCharacter(this)) - _characters.begin();}
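
/* Illustrative sketch only, not part of Layout: the drill-down functions above
   boil down to a binary search for the first child whose parent index is not
   less than the one requested. The version below shows the same idea on a
   simplified stand-in type, using a lambda in place of the EMIT_PREDICATE
   functors. SpanOnLine and firstSpanOnLine() are hypothetical names, and the
   sketch assumes a C++11 compiler, which this header itself does not require.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct SpanOnLine { unsigned in_line; };  // hypothetical stand-in for Layout::Span

// Index of the first span whose line index is >= line_index, or spans.size()
// if there is none. Requires spans to be ordered by in_line.
inline unsigned firstSpanOnLine(std::vector<SpanOnLine> const &spans, unsigned line_index)
{
    assert(std::is_sorted(spans.begin(), spans.end(),
        [](SpanOnLine const &a, SpanOnLine const &b) { return a.in_line < b.in_line; }));
    return static_cast<unsigned>(
        std::lower_bound(spans.begin(), spans.end(), line_index,
            [](SpanOnLine const &span, unsigned index) { return span.in_line < index; })
        - spans.begin());
}
```

   Note that asking for a line which owns no spans yields the first span of the
   next populated line rather than an error, so callers that care need to check
   the in_line of the result themselves. */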

    /** given an x coordinate and a line number, returns an iterator
    pointing to the closest cursor position on that line to the
    coordinate. */
    iterator _cursorXOnLineToIterator(unsigned line_index, double local_x) const;

    /** calculates the width of a chunk, which is the largest x
    coordinate (start or end) of the spans contained within it. */
    double _getChunkWidth(unsigned chunk_index) const;
};

/** \brief Holds a position within the glyph output of Layout.

Used to access the output of a Layout, query information and generally
move around in it. See Layout for a glossary of the names of functions.

I'm not going to document all the methods because most of their names make
their function self-evident.

A lot of the functions would do the same thing in a naive implementation
for Latin-only text, for example nextCharacter(), nextCursorPosition() and
cursorRight(). Generally it's fairly obvious which one you should use in a
given situation, but sometimes you might need to put some thought into it.

All the methods return false if the requested action would have caused the
current position to move out of bounds. In this case the position is moved
to either begin() or end(), depending on which direction you were going.

Note that some characters do not have a glyph representation (e.g. line
breaks), so if you try using prev/nextGlyph() from one of these you're
heading for a crash.
*/
class Layout::iterator {
public:
    friend class Layout;
    // this is just so you can create uninitialised iterators - don't actually try to use one
    iterator() : _parent_layout(NULL) {}
    // no copy constructor required, the default does what we want
    bool operator== (iterator const &other) const
        {return _glyph_index == other._glyph_index && _char_index == other._char_index;}
    bool operator!= (iterator const &other) const
        {return _glyph_index != other._glyph_index || _char_index != other._char_index;}

    /* mustn't compare _glyph_index in these operators because for characters
    that don't have glyphs (line breaks, elided soft hyphens, etc), the glyph
    index is -1 which makes them not well-ordered. To be honest, iterating by
    glyphs is not very useful and should be avoided. */
    bool operator< (iterator const &other) const
        {return _char_index < other._char_index;}
    bool operator<= (iterator const &other) const
        {return _char_index <= other._char_index;}
    bool operator> (iterator const &other) const
        {return _char_index > other._char_index;}
    bool operator>= (iterator const &other) const
        {return _char_index >= other._char_index;}

    /* **** visual-oriented methods **** */

    //glyphs
    inline bool prevGlyph();
    inline bool nextGlyph();

    //span
    bool prevStartOfSpan();
    bool thisStartOfSpan();
    bool nextStartOfSpan();

    //chunk
    bool prevStartOfChunk();
    bool thisStartOfChunk();
    bool nextStartOfChunk();

    //line
    bool prevStartOfLine();
    bool thisStartOfLine();
    bool nextStartOfLine();
    bool thisEndOfLine();

    //shape
    bool prevStartOfShape();
    bool thisStartOfShape();
    bool nextStartOfShape();

    /* **** text-oriented methods **** */

    //characters
    inline bool nextCharacter();
    inline bool prevCharacter();

    bool nextCursorPosition();
    bool prevCursorPosition();
    bool nextLineCursor();
    bool prevLineCursor();

    //words
    bool nextStartOfWord();
    bool prevStartOfWord();
    bool nextEndOfWord();
    bool prevEndOfWord();

    //sentences
    bool nextStartOfSentence();
    bool prevStartOfSentence();
    bool nextEndOfSentence();
    bool prevEndOfSentence();

    //paragraphs
    bool prevStartOfParagraph();
    bool thisStartOfParagraph();
    bool nextStartOfParagraph();
    //no endOfPara methods because that's just the previous char

    //sources
    bool prevStartOfSource();
    bool thisStartOfSource();
    bool nextStartOfSource();

    //logical cursor movement
    bool cursorUp();
    bool cursorDown();
    bool cursorLeft();
    bool cursorRight();

    //logical cursor movement (by word or paragraph)
    bool cursorUpWithControl();
    bool cursorDownWithControl();
    bool cursorLeftWithControl();
    bool cursorRightWithControl();

private:
    Layout const *_parent_layout;
    int _glyph_index;        ///< index into Layout::_glyphs, or -1
    unsigned _char_index;    ///< index into Layout::_characters
    bool _cursor_moving_vertically;
    /** for cursor up/down movement we must maintain the x position where
    we started so the cursor doesn't 'drift' left or right with the repeated
    quantization to character boundaries. */
    double _x_coordinate;

    inline iterator(Layout const *p, unsigned c, int g)
        : _parent_layout(p), _glyph_index(g), _char_index(c), _cursor_moving_vertically(false), _x_coordinate(0.0) {}
    inline iterator(Layout const *p, unsigned c)
        : _parent_layout(p), _glyph_index(p->_characters[c].in_glyph), _char_index(c), _cursor_moving_vertically(false), _x_coordinate(0.0) {}
    // no dtor required
    void beginCursorUpDown();  ///< stores the current x coordinate so that the cursor won't drift. See #_x_coordinate

    /** moves forward or backwards one cursor position according to the
    directionality of the current paragraph, but ignoring block progression.
    Helper for the cursor*() functions. */
    bool _cursorLeftOrRightLocalX(Direction direction);

    /** moves forward or backwards until the next character with
    is_word_start set, according to the directionality of the current
    paragraph, but ignoring block progression. Helper for the
    cursor*WithControl() functions. */
    bool _cursorLeftOrRightLocalXByWord(Direction direction);
};
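
/* Illustrative sketch only, not part of Layout: why _x_coordinate is kept
   while the cursor moves vertically. The toy model below snaps an x position
   to the nearest cursor stop on each successive line, either reusing the
   original x every time or feeding the snapped x of one line into the next.
   CursorStops, nearestStop() and walkDown() are hypothetical names, and the
   nearest-stop rule is an assumption about how a line picks its cursor
   position, not a description of _cursorXOnLineToIterator().

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

typedef std::vector<double> CursorStops;  // hypothetical: x of each cursor position on a line

// Index of the stop closest to x (the first one wins a tie); needs a non-empty line.
inline std::size_t nearestStop(CursorStops const &line, double x)
{
    assert(!line.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < line.size(); ++i)
        if (std::fabs(line[i] - x) < std::fabs(line[best] - x)) best = i;
    return best;
}

// Visits every line in order, starting from start_x, and returns the final x.
// With remember_x the original x is the target on every line; without it the
// snapped x from the previous line is the target, so snapping errors add up.
inline double walkDown(std::vector<CursorStops> const &lines, double start_x, bool remember_x)
{
    double current = start_x;
    for (std::size_t l = 0; l < lines.size(); ++l) {
        double const probe = remember_x ? start_x : current;
        current = lines[l][nearestStop(lines[l], probe)];
    }
    return current;
}
```

   With stops at {0, 10, 20}, {0, 14, 28} and {0, 9, 18} and a starting x of 12,
   remembering x ends on 9, while re-reading the snapped x ends on 18: the
   cursor has drifted a whole character to the right. */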

// ************************** inline methods

inline SPCurve* Layout::convertToCurves() const
    {return convertToCurves(begin(), end());}

inline Layout::iterator Layout::begin() const
    {return iterator(this, 0, 0);}

inline Layout::iterator Layout::end() const
    {return iterator(this, _characters.size(), _glyphs.size());}

inline Layout::iterator Layout::charIndexToIterator(int char_index) const
{
    if (char_index < 0) return begin();
    if (char_index >= (int)_characters.size()) return end();
    return iterator(this, char_index);
}

inline int Layout::iteratorToCharIndex(Layout::iterator const &it) const
    {return it._char_index;}

inline void Layout::validateIterator(Layout::iterator *it) const
{
    it->_parent_layout = this;
    if (it->_char_index >= _characters.size()) {
        it->_char_index = _characters.size();
        it->_glyph_index = _glyphs.size();
    } else
        it->_glyph_index = _characters[it->_char_index].in_glyph;
}

inline Layout::iterator Layout::getNearestCursorPositionTo(NR::Point &point) const
    {return getNearestCursorPositionTo(point[0], point[1]);}

inline Layout::iterator Layout::getLetterAt(NR::Point &point) const
    {return getLetterAt(point[0], point[1]);}

inline unsigned Layout::lineIndex(iterator const &it) const
    {return it._char_index == _characters.size() ? _lines.size() - 1 : _characters[it._char_index].chunk(this).in_line;}

inline unsigned Layout::shapeIndex(iterator const &it) const
    {return it._char_index == _characters.size() ? _input_wrap_shapes.size() - 1 : _characters[it._char_index].line(this).in_shape;}

inline bool Layout::isWhitespace(iterator const &it) const
    {return it._char_index == _characters.size() || _characters[it._char_index].char_attributes.is_white;}

inline int Layout::characterAt(iterator const &it) const
{
    void *unused;
    Glib::ustring::iterator text_iter;
    getSourceOfCharacter(it, &unused, &text_iter);
    return *text_iter;
}

inline bool Layout::isCursorPosition(iterator const &it) const
    {return it._char_index == _characters.size() || _characters[it._char_index].char_attributes.is_cursor_position;}

inline bool Layout::isStartOfWord(iterator const &it) const
    {return it._char_index != _characters.size() && _characters[it._char_index].char_attributes.is_word_start;}

inline bool Layout::isEndOfWord(iterator const &it) const
    {return it._char_index == _characters.size() || _characters[it._char_index].char_attributes.is_word_end;}

inline bool Layout::isStartOfSentence(iterator const &it) const
    {return it._char_index != _characters.size() && _characters[it._char_index].char_attributes.is_sentence_start;}

inline bool Layout::isEndOfSentence(iterator const &it) const
    {return it._char_index == _characters.size() || _characters[it._char_index].char_attributes.is_sentence_end;}

inline unsigned Layout::paragraphIndex(iterator const &it) const
    {return it._char_index == _characters.size() ? _paragraphs.size() - 1 : _characters[it._char_index].line(this).in_paragraph;}

inline Layout::Alignment Layout::paragraphAlignment(iterator const &it) const
    {return _paragraphs[paragraphIndex(it)].alignment;}

inline bool Layout::iterator::nextGlyph()
{
    _cursor_moving_vertically = false;
    if (_glyph_index >= (int)_parent_layout->_glyphs.size() - 1) {
        if (_glyph_index == (int)_parent_layout->_glyphs.size()) return false;
        _char_index = _parent_layout->_characters.size();
        _glyph_index = _parent_layout->_glyphs.size();
    }
    else _char_index = _parent_layout->_glyphs[++_glyph_index].in_character;
    return true;
}

inline bool Layout::iterator::prevGlyph()
{
    _cursor_moving_vertically = false;
    // note: there is no guard for _glyph_index == -1 (a character without a
    // glyph); see the warning in the class documentation
    if (_glyph_index == 0) return false;
    _char_index = _parent_layout->_glyphs[--_glyph_index].in_character;
    return true;
}

inline bool Layout::iterator::nextCharacter()
{
    _cursor_moving_vertically = false;
    if (_char_index + 1 >= _parent_layout->_characters.size()) {
        if (_char_index == _parent_layout->_characters.size()) return false;
        _char_index = _parent_layout->_characters.size();
        _glyph_index = _parent_layout->_glyphs.size();
    }
    else _glyph_index = _parent_layout->_characters[++_char_index].in_glyph;
    return true;
}

inline bool Layout::iterator::prevCharacter()
{
    _cursor_moving_vertically = false;
    if (_char_index == 0) return false;
    _glyph_index = _parent_layout->_characters[--_char_index].in_glyph;
    return true;
}
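
/* Illustrative sketch only, not part of Layout: the boundary behaviour of
   nextCharacter() reproduced on a miniature cursor, to show that stepping off
   the last character lands on the end() position and still returns true, and
   that only a further step returns false. MiniCursor is a hypothetical name;
   in_glyph plays the role of Character::in_glyph, with -1 marking a character
   that has no glyph.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature of Layout::iterator.
struct MiniCursor {
    std::vector<int> const *in_glyph;  // glyph index per character, -1 if none
    unsigned glyph_count;              // stands in for _glyphs.size()
    unsigned char_index;
    int glyph_index;

    bool next()
    {
        assert(in_glyph != nullptr);
        unsigned const n = static_cast<unsigned>(in_glyph->size());
        if (char_index + 1 >= n) {
            if (char_index == n) return false;  // already at end(): nothing to do
            char_index = n;                     // step onto the end() position
            glyph_index = static_cast<int>(glyph_count);
        } else {
            glyph_index = (*in_glyph)[++char_index];
        }
        return true;
    }
};
```

   Starting at character 0 of a three-character text whose middle character is
   a line break, three calls to next() succeed (the first leaves glyph_index
   at -1) and the fourth returns false without moving. */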

}//namespace Text
}//namespace Inkscape

#endif


/*
  Local Variables:
  mode:c++
  c-file-style:"stroustrup"
  c-file-offsets:((innamespace . 0)(inline-open . 0)(case-label . +))
  indent-tabs-mode:nil
  fill-column:99
  End:
*/
// vim: filetype=cpp:expandtab:shiftwidth=4:tabstop=8:softtabstop=4:encoding=utf-8:textwidth=99 :