
Tables are Hard, Part 3: Streaming Data

Nick Lanam
April 06, 2022 | 4 minutes read
Lead SWE @ New Relic, Founding Engineer @ Pixie Labs

Let's build a tailing log viewer in React. Or at least, a facsimile of one.

In our previous post, we made a table with sorting, filtering, and column controls.

This time, we'll upgrade it to handle a constant barrage of data, a bit like a log viewer would.

This tutorial is aimed at relative beginners to web development; we'll keep the set of tools and features simple to remain focused. Skip to the end for suggestions for a more serious project.

Final Code | Live Demo

A quick video demonstration of what we'll build.

This article is part of a series:

Setting Up

To begin, you'll need the same basic setup as the previous article in this series:

  • A modern browser.
  • NodeJS 14 or newer, NPM 5.2 or newer (NodeJS 14 comes with NPM 6).
  • Something to edit code with.
  • A basic understanding of the command line, HTML, CSS, JavaScript, and React.

Using the same principles as the previous article, we'll start off with a fresh template:

git clone https://github.com/pixie-io/pixie-demos.git
cd pixie-demos/react-table/6-new-base
npm install
npm start

If this succeeds, your browser should open a demo like this running on localhost:3000.

This template has different mock data and some adjusted CSS compared to the final demo in the previous article, but the features are unchanged.

Excessive Data; Virtual Scrolling

Since we're looking at definitely-legitimate log data, let's see what happens if we have tens of thousands of rows with our current code:

- const data = useStaticData(100);
+ const data = useStaticData(50_000);

If you save that change, the page will update itself and you'll immediately see some problems:

  • Even on a fast computer, the page takes a while to show data and become responsive.
  • Sorting, filtering, and resizing data makes the page freeze for a moment.
  • With enough rows, it can even crash the browser tab.
  • Even scrolling the table feels sluggish.

With this many rows, React spends a lot of time rendering them every time context changes [1]. This happens on the main thread and blocks just about everything [2].

But why update tens of thousands of rows when only a handful are visible at a time? Can't we render just the visible rows? This technique is called virtual scrolling, and it's a complex topic.
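
To make the idea concrete, here is a minimal sketch of the arithmetic behind virtual scrolling for rows of a fixed height. The names `visibleRange`, `rowStyle`, and `overscan` are made up for this illustration; this shows the general technique, not how any particular library implements it.

```javascript
// Hypothetical helper: which row indices must be rendered for a
// fixed-row-height list at a given scroll offset?
function visibleRange({ scrollTop, viewportHeight, rowHeight, rowCount, overscan = 2 }) {
  if (rowCount === 0) return { start: 0, stop: -1 };
  // First row that is at least partially inside the viewport.
  const first = Math.floor(scrollTop / rowHeight);
  // Last row that is at least partially inside the viewport.
  const last = Math.ceil((scrollTop + viewportHeight) / rowHeight) - 1;
  // Render a few extra rows on each side so quick scrolling doesn't show gaps.
  return {
    start: Math.max(0, first - overscan),
    stop: Math.min(rowCount - 1, last + overscan),
  };
}

// Each rendered row is positioned absolutely inside one tall container
// whose height is rowCount * rowHeight, so the scrollbar stays accurate.
function rowStyle(index, rowHeight) {
  return { position: 'absolute', top: index * rowHeight, height: rowHeight, width: '100%' };
}

const range = visibleRange({
  scrollTop: 3400, viewportHeight: 300, rowHeight: 34, rowCount: 50000,
});
console.log(range); // { start: 98, stop: 110 }
```

Out of 50,000 rows, only 13 need to exist in the DOM at this scroll position.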

In modern browsers, virtual scrolling can be built using tools like IntersectionObserver, asynchronous scroll handlers, and a headlamp (to brave the Mines of MDN, the Sea of StackOverflow, and other odysseys).

Rather than going down that path in the forest of all knowledge, we'll use a pre-built solution for virtual scrolling in React: react-window.

Adding it is straightforward at first. npm install react-window, then a few tweaks to Table.js:

...
import {
useSortBy,
useResizeColumns,
} from 'react-table';
+import { FixedSizeList } from 'react-window';
import './Table.css';
...
export default function Table({ data: { columns, data } }) {
))}
</thead>
<tbody {...getTableBodyProps()}>
- {rows.map((row) => {
- prepareRow(row);
- return (
- <tr {...row.getRowProps()}>
+ {/*
+ * This has a few problems once we use 10,000 rows and add virtual scrolling:
+ * - At 10,000 rows, sorting and filtering slow down quite a lot. We'll address that later.
+ * Virtual scrolling makes scrolling and resizing columns fast again, but sorting and filtering still chug.
+ * - The table's height is no longer dynamic. This can be fixed by detecting the parent's dimensions.
+ * - If the user's browser shows layout-impacting scrollbars (Firefox does so by default for example),
+ * the header is not part of the scrolling area and thus has a different width than the scroll body.
+ * This can be fixed by detecting how wide the scrollbar is and whether it's present, then using that
+ * to adjust the <thead/> width accordingly.
+ */}
+ <FixedSizeList
+ itemCount={rows.length}
+ height={300}
+ itemSize={34}
+ >
+ {({ index, style }) => {
+ const row = rows[index];
+ prepareRow(row);
+ return <tr {...row.getRowProps({ style })}>
{row.cells.map(cell => (
<td {...cell.getCellProps()}>
{cell.render('Cell')}
</td>
))}
</tr>
- );
- })}
+ }}
+ </FixedSizeList>
</tbody>
</table>
</div>

Saving that, we immediately see better scroll performance. But we see a few problems too:

That doesn't look right...
  • Layout and CSS are, to put it gently, hosed. Especially in browsers with layout-impacting scrollbars [3]!
  • The browser console complains that we're no longer using a valid structure for our table.
  • Scrolling rows and resizing columns are fast now, but sorting and filtering are still slow. They still manipulate the full set of rows.

To fix these issues, we'll do a few things:

  • CSS cleanup: fitting the table into a flexible space that won't overflow the viewport.
  • Making automatic height and virtual scrolling work together: a wrapper element and a React hook to pass that wrapper's dimensions to react-window.
  • Handling scrollbar sizes: another hook to compute sizes, and a right margin on the table's header.
  • Invalid DOM: we aren't supposed to mix native <table> elements with <div>s. Switching to entirely use <div>s corrects that. We can use the role attribute, which react-table provides through getTableBodyProps() and friends, to keep semantic HTML and keep screen readers happy.
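
The scrollbar measurement behind the third fix can be sketched as below. `measureScrollbars` is a hypothetical helper, not the hook from the demo repository. It assumes the measured element has no border, since `offsetWidth` includes borders while `clientWidth` does not. It accepts anything shaped like an element, so it can be tried with a plain object.

```javascript
// Hypothetical helper, not the demo's hook. Assumes the element has no border.
function measureScrollbars(element) {
  return {
    // Space taken by a vertical scrollbar (0 if overlaid or absent).
    width: element.offsetWidth - element.clientWidth,
    // Space taken by a horizontal scrollbar (0 if overlaid or absent).
    height: element.offsetHeight - element.clientHeight,
  };
}

// A plain object standing in for a scrolling <div> with a 17px vertical scrollbar.
const fakeScroller = { offsetWidth: 500, clientWidth: 483, offsetHeight: 300, clientHeight: 300 };
console.log(measureScrollbars(fakeScroller)); // { width: 17, height: 0 }
```

The resulting `width` is what the header needs as a right margin so that its columns line up with the scrolling body.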

With those fixes in place, we have a much nicer demonstration of virtual scrolling.

Much better.

Streaming Even More Data

Now imagine the server is streaming chunks of log data to add to the table every second. How should we handle that?

First, let's stop imagining and actually implement a (mock) data stream:

export function useStreamingData(
  rowsPerBatch = 5, delay = 1000, maxBatches = 100
) {
  const [minTimestamp, setMinTimestamp] = React.useState(
    Date.now() - 1000 * 60 * 60 * 24 * 7 // Logs since last week
  );
  const [batches, setBatches] = React.useState([]);
  const addBatch = React.useCallback(() => {
    if (batches.length >= maxBatches) return;
    const batch = Array(rowsPerBatch).fill(0).map(
      (_, i) => generateRow(
        minTimestamp + i * 1000,
        minTimestamp + i * 1999
      )
    );
    setBatches([...batches, batch]);
    setMinTimestamp(minTimestamp + rowsPerBatch * 2000);
  }, [batches, maxBatches, minTimestamp, rowsPerBatch]);
  React.useEffect(() => {
    const newTimer = global.setInterval(addBatch, delay);
    return () => {
      global.clearInterval(newTimer);
    };
  }, [delay, addBatch]);
  return React.useMemo(
    () => ({ columns, data: batches.flat() }),
    [batches]
  );
}

This doesn't do much: it just adds a new chunk of rows to the data going to react-table every second, up to a (low) safety limit. We'll increase the amount of data and the rate in a moment; but first we have new problems to solve:
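
One detail of the hook above: because addBatch lists batches among its dependencies, it is recreated after every batch, which also restarts the interval in the effect. That works fine for a demo. As an alternative sketch (not the article's code), the update can be written as a pure function of the previous state, which could then be passed to a state updater so the timer callback never needs to change. `nextStreamState` and `makeRow` are hypothetical names; `makeRow` stands in for the demo's generateRow.

```javascript
// Hypothetical alternative: compute the next stream state from the previous one.
// `makeRow` is a placeholder for the demo's generateRow(min, max).
function nextStreamState(state, { rowsPerBatch = 5, maxBatches = 100, makeRow }) {
  // Same safety limit as the hook: stop growing once enough batches exist.
  if (state.batches.length >= maxBatches) return state;
  const batch = Array.from({ length: rowsPerBatch }, (_, i) =>
    makeRow(state.minTimestamp + i * 1000, state.minTimestamp + i * 1999)
  );
  return {
    minTimestamp: state.minTimestamp + rowsPerBatch * 2000,
    batches: [...state.batches, batch],
  };
}

// Usage with a trivial row factory:
const options = { rowsPerBatch: 2, maxBatches: 3, makeRow: (min, max) => ({ min, max }) };
let streamState = { minTimestamp: 0, batches: [] };
streamState = nextStreamState(streamState, options);
console.log(streamState.batches.length, streamState.minTimestamp); // 1 4000
```

In React this might be wired up as `setState(prev => nextStreamState(prev, options))`. React may call updater functions twice in development Strict Mode, so a random row factory would produce different rows on each call. That is harmless for mock data, but worth knowing.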

Streaming Broke Everything: A Case Study
  • Scroll position jumps to the top every time a batch is added.
  • Sorting and filtering both reset every time a batch is added.
  • Virtual scrolling doesn't render until you scroll again; blank rows appear instead.

What gives? Everything was working a moment ago.

Every time we change the data backing our instance of react-table, both it and react-window reset a number of variables. Luckily for us, these have easy solutions:

  • Sorting, column resizing: we can turn that resetting behavior off by adding autoResetSortBy: false and autoResetResize: false to the useTable configuration.
  • Scroll reset, blank rows: a bit of React.memo clears that right up [4].

+const TableContext = React.createContext(null);
+
+/**
+ * By memoizing this, we ensure that react-window can recycle rendered rows that haven't changed when new data comes in.
+ */
+const BodyRow = React.memo(({ index, style }) => {
+ const { rows, instance: { prepareRow } } = React.useContext(TableContext);
+ const row = rows[index];
+ prepareRow(row);
+ return (
+ <div className={styles.Row} {...row.getRowProps({ style })}>
+ {row.cells.map(cell => (
+ <div className={styles.BodyCell} {...cell.getCellProps()}>
+ {cell.render('Cell')}
+ </div>
+ ))}
+ </div>
+ );
+});
+
+/**
+ * Setting outerElementType on FixedSizeList lets us override properties on the scroll container. However, FixedSizeList
+ * redraws this on every data change, so the wrapper component needs to be memoized for scroll position to be retained.
+ *
+ * Note: If the list is sorted such that new items are added to the top, the items in view will still change
+ * because the ones that _were_ at that scroll position were pushed down.
+ * This can be accounted for in a more complete implementation, but it's out of scope for this demonstration.
+ */
+const ForcedScrollWrapper = React.memo(React.forwardRef((props, ref) => (
+ // Rather than handling when the scrollbar is or isn't visible, this basic tutorial
+ // forces the scrollbar to appear even when it isn't needed. Not great, but out of scope.
+ // react-window passes a ref to this element; forwardRef attaches it to the <div>.
+ <div {...props} ref={ref} style={{ ...props.style, overflowY: 'scroll' }}></div>
+)));
+
-export default function Table({ data: { columns, data } }) {
+const Table = React.memo(({ data: { columns, data } }) => {
const reactTable = useTable({
columns,
data,
+ autoResetSortBy: false,
+ autoResetResize: false,
},
useFlexLayout,
useGlobalFilter,
useSortBy,
useResizeColumns
);
const {
getTableProps,
getTableBodyProps,
headerGroups,
rows,
allColumns,
- prepareRow,
setGlobalFilter
} = reactTable;
const { width: scrollbarWidth } = useScrollbarSize();
const [fillContainer, setFillContainer] = React.useState(null)
const fillContainerRef = React.useCallback((el) => setFillContainer(el), []);
const { height: fillHeight } = useContainerSize(fillContainer);
+ const context = React.useMemo(() => ({
+ instance: reactTable,
+ rows,
+ // By also watching reactTable.state specifically, we make sure that resizing columns is reflected immediately.
+ // eslint-disable-next-line react-hooks/exhaustive-deps
+ }), [reactTable, rows, reactTable.state]);
+
return (
+ <TableContext.Provider value={context}>
<div className={styles.root}>
<header>
<ColumnSelector columns={allColumns} />
<Filter onChange={setGlobalFilter} />
</header>
<div className={styles.fill} ref={fillContainerRef}>
<div {...getTableProps()} className={styles.Table}>
<div className={styles.TableHead}>
...
</div>
<div className={styles.TableBody} {...getTableBodyProps()}>
<FixedSizeList
- outerElementType={(props, ref) => (
- // Instead of handling complexity with when the scrollbar is/isn't visible for this basic tutorial,
- // we'll instead force the scrollbar to appear even when it isn't needed. Suboptimal, but out of scope.
- <div {...props} style={{ ...props.style, overflowY: 'scroll' }} forwardedRef={ref}></div>
- )}
+ outerElementType={ForcedScrollWrapper}
itemCount={rows.length}
height={fillHeight - 56}
itemSize={34}
+ onItemsRendered={onRowsRendered}
>
- {({ index, style }) => {
- const row = rows[index];
- prepareRow(row);
- return <div className={styles.Row} {...row.getRowProps({ style })}>
- {row.cells.map(cell => (
- <div className={styles.BodyCell} {...cell.getCellProps()}>
- {cell.render('Cell')}
- </div>
- ))}
- </div>
- }}
+ {BodyRow}
</FixedSizeList>
</div>
</div>
</div>
</div>
</div>
+ </TableContext.Provider>
);
-}
+});
+
+export default Table;

Frills

We're nearly done now. For a bit of final polish, we'll tweak useData.js one more time to keep adding more data and removing old data when it hits the limit; we'll also add a counter under the table to track visible and total rows. While we're here, let's default to sorting by timestamp too.

...
const reactTable = useTable({
columns,
data,
autoResetSortBy: false,
autoResetResize: false,
+ disableSortRemove: true,
+ initialState: {
+ sortBy: [{ id: 'timestamp', desc: false }],
+ },
},
useFlexLayout,
useGlobalFilter,
useSortBy,
useResizeColumns
);
...
+ const [visibleStart, setVisibleStart] = React.useState(1);
+ const [visibleStop, setVisibleStop] = React.useState(1);
+ const viewportDetails = React.useMemo(() => {
+ const count = visibleStop - visibleStart + 1;
+ let text = `Showing ${visibleStart + 1} - ${visibleStop + 1} / ${rows.length} records`;
+ if (rows.length === 500) text += ' (most recent only)';
+
+ if (count <= 0) {
+ text = 'No records to show';
+ } else if (count >= rows.length) {
+ text = '\u00A0'; // non-breaking space
+ }
+ return text;
+ }, [rows.length, visibleStart, visibleStop]);
+
+ const onRowsRendered = React.useCallback(({ visibleStartIndex, visibleStopIndex }) => {
+ setVisibleStart(visibleStartIndex);
+ setVisibleStop(visibleStopIndex);
+ }, []);
...
return (
<TableContext.Provider value={context}>
<div className={styles.root}>
<header>
<ColumnSelector columns={allColumns} />
<Filter onChange={setGlobalFilter} />
</header>
<div className={styles.fill} ref={fillContainerRef}>
<div {...getTableProps()} className={styles.Table}>
<div className={styles.TableHead}>
...
</div>
<div className={styles.TableBody} {...getTableBodyProps()}>
...
</div>
</div>
</div>
+ <div className={styles.ViewportDetails} style={{ marginRight: scrollbarWidth }}>
+ {viewportDetails}
+ </div>
</div>
</div>
</TableContext.Provider>
);
...
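
The useData.js change isn't shown above. One way to keep adding data while dropping the oldest rows is a simple rolling window, sketched below. `appendWithLimit` is a hypothetical helper, and the default limit of 500 rows is an assumption taken from the `rows.length === 500` check in the counter code; the demo's actual implementation may differ.

```javascript
// Hypothetical helper: append a batch, keeping only the newest `maxRows` rows.
// Returns a new array, so React sees a changed reference.
function appendWithLimit(rows, batch, maxRows = 500) {
  const combined = rows.concat(batch);
  return combined.length > maxRows
    ? combined.slice(combined.length - maxRows)
    : combined;
}

// Simulate 120 batches of 5 rows each (600 rows in total).
let windowedRows = [];
for (let batch = 0; batch < 120; batch++) {
  const newRows = Array.from({ length: 5 }, (_, i) => ({ id: batch * 5 + i }));
  windowedRows = appendWithLimit(windowedRows, newRows);
}
console.log(windowedRows.length, windowedRows[0].id); // 500 100
```

The row count stays bounded no matter how long the stream runs, which keeps sorting and filtering costs bounded too.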

At last, we have a fully working demo (click to interact):

Conclusion

With our final code, we've built a complete demo:

  • Data streams to the table constantly, without interrupting scrolling or sorting.
  • Data can be sorted and filtered.
  • Columns can be selectively resized and hidden.
  • The table continues to perform smoothly indefinitely.

While this doesn't cover nearly everything that a real log viewer does, it does introduce one of the harder topics.

There's a great deal more to explore in this space, especially in terms of browser performance and complex data. Tables are hard, but we hope these articles have provided a foundation to learn and build something awesome yourself.

Footnotes


  1. React 18 can help a little bit with this if you're clever, but you're better off fixing the root cause(s) first.
  2. JavaScript's event loop, web workers, DOM repaints, and more are out of scope for this tutorial. This primer on the pixel pipeline is a good start if you wish to learn the basics. Performance is complicated!
  3. By default, scrollbars in Firefox take up space and affect layout. This used to be normal in all browsers. These days, most browsers overlay scrollbars as if they had a width of zero. You can detect this with element.offsetWidth - element.clientWidth (element.offsetHeight - element.clientHeight for horizontal scrollbars). If the element isn't scrolling or if scrollbars are set to overlay, these values will be 0; if not, they'll tell you how much space the scrollbars take up.
  4. We wrote a blog post on memoization. For this code, it's to retain state and not for performance.


Pixie was originally created and contributed by New Relic, Inc.

Copyright © 2018 - The Pixie Authors. All Rights Reserved. | Content distributed under CC BY 4.0.