
Format Token

The format token component visualizes how language models tokenize text. It displays the text split into tokens according to the selected strategy: BPE, WordPiece, SentencePiece, LLaMA, character-level, or whitespace.

Example

Hello world! This is a test of token segmentation.
<script type="module">
  import '@blueprintui/components/include/format-token.js';
</script>

<bp-format-token>Hello world! This is a test of token segmentation.</bp-format-token>

Formats

BPE (GPT-style)

Hello world! This is a test.

WordPiece (BERT-style)

Hello world! This is a test.

SentencePiece

Hello world! This is a test.

LLaMA

Hello world! This is a test.

Character-level

Hello world!

Whitespace

Hello world! This is a test.
<script type="module">
  import '@blueprintui/components/include/format-token.js';
</script>

<div bp-layout="block gap:md">
  <div>
    <h4>BPE (GPT-style)</h4>
    <bp-format-token format="bpe">Hello world! This is a test.</bp-format-token>
  </div>

  <div>
    <h4>WordPiece (BERT-style)</h4>
    <bp-format-token format="word-piece">Hello world! This is a test.</bp-format-token>
  </div>

  <div>
    <h4>SentencePiece</h4>
    <bp-format-token format="sentence-piece">Hello world! This is a test.</bp-format-token>
  </div>

  <div>
    <h4>LLaMA</h4>
    <bp-format-token format="llama">Hello world! This is a test.</bp-format-token>
  </div>

  <div>
    <h4>Character-level</h4>
    <bp-format-token format="character">Hello world!</bp-format-token>
  </div>

  <div>
    <h4>Whitespace</h4>
    <bp-format-token format="whitespace">Hello world! This is a test.</bp-format-token>
  </div>
</div>
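
For intuition about the two simplest formats, here is a minimal sketch in plain JavaScript. It is not the component's implementation: the function names `whitespaceTokens` and `characterTokens` are hypothetical, and the component may treat whitespace and punctuation differently. The subword formats (BPE, WordPiece, SentencePiece, LLaMA) rely on learned vocabularies and are not approximated here.

```javascript
// Illustrative only: NOT how bp-format-token is implemented.

// Whitespace strategy (assumed behavior): split on runs of whitespace
// and drop empty pieces.
function whitespaceTokens(text) {
  return text.trim().split(/\s+/).filter(Boolean);
}

// Character strategy (assumed behavior): one token per Unicode code point,
// spaces included.
function characterTokens(text) {
  return Array.from(text);
}

console.log(whitespaceTokens('Hello world! This is a test.'));
// → ['Hello', 'world!', 'This', 'is', 'a', 'test.']

console.log(characterTokens('Hello world!').length);
// → 12
```

Note that punctuation stays attached to the neighboring word in this whitespace sketch, which is one reason subword strategies usually produce a different token count for the same sentence.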

Install

NPM

// npm package
import '@blueprintui/components/include/format-token.js';

CDN

<script type="module">
  import 'https://cdn.jsdelivr.net/npm/@blueprintui/components/include/format-token.js/+esm';
</script>

bp-format-token

Properties

Name     Types                                                                            Description
format   'bpe' | 'word-piece' | 'sentence-piece' | 'llama' | 'character' | 'whitespace'   Tokenization format/strategy to use
tokens   string[]
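
Because `format` is exposed as a property as well as an attribute, it can presumably be set from script. The sketch below guards the assignment against values outside the documented list. `setTokenFormat` is a hypothetical helper, not part of BlueprintUI, and how the component reacts to an unlisted value is not documented here. The `tokens` property is listed without a description, so the sketch leaves it alone.

```javascript
// Format values copied from the Properties listing.
const FORMATS = ['bpe', 'word-piece', 'sentence-piece', 'llama', 'character', 'whitespace'];

// Hypothetical helper (not a BlueprintUI API): assign `format` only when
// the value is one of the documented options.
function setTokenFormat(element, format) {
  if (!FORMATS.includes(format)) {
    throw new RangeError(`Unknown format "${format}"; expected one of: ${FORMATS.join(', ')}`);
  }
  element.format = format;
  return element;
}

// Browser usage, guarded so the snippet also loads outside a DOM environment.
if (typeof document !== 'undefined') {
  const el = document.querySelector('bp-format-token');
  if (el) setTokenFormat(el, 'word-piece');
}
```

Setting the `format` attribute in markup, as in the examples earlier on this page, remains the simpler route when the value is known ahead of time.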

Attributes

Name     Types                                                                            Description
format   'bpe' | 'word-piece' | 'sentence-piece' | 'llama' | 'character' | 'whitespace'   Tokenization format/strategy to use

CSS Properties

Name              Types   Description
--padding
--border-radius
--border
--font-family
--line-height
--gap
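
The custom properties are listed without types or descriptions, so the following is only a sketch of how they might be overridden from script. The values are illustrative assumptions, `applyTokenTheme` is a hypothetical helper, and whether each property affects the host element or the individual tokens is not stated here.

```javascript
// Property names come from the CSS Properties listing; the values are
// illustrative guesses, not documented defaults.
const exampleTheme = {
  '--padding': '2px 4px',
  '--border-radius': '4px',
  '--gap': '2px',
};

// Hypothetical helper: copy each custom property onto the element's inline style
// using the standard CSSStyleDeclaration.setProperty method.
function applyTokenTheme(element, theme) {
  for (const [name, value] of Object.entries(theme)) {
    element.style.setProperty(name, value);
  }
  return element;
}

// Browser usage, guarded so the snippet also loads outside a DOM environment.
if (typeof document !== 'undefined') {
  const el = document.querySelector('bp-format-token');
  if (el) applyTokenTheme(el, exampleTheme);
}
```

The same overrides could equally be declared in a stylesheet rule targeting `bp-format-token`; the script form is shown only to stay in one language.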

Slots

Name      Types   Description
default           Provide text content to be tokenized