NoSQL data store using the seastar framework, compatible with Apache Cassandra

Overview

Scylla

What is Scylla?

Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.

For more information, please see the ScyllaDB web site.

Build Prerequisites

Scylla is fairly fussy about its build environment, requiring a very recent C++20 compiler and recent versions of many libraries to build. The document HACKING.md includes detailed information on building and developing Scylla, but to get Scylla building quickly on (almost) any build machine, Scylla offers a frozen toolchain: a pre-configured Docker image which includes recent versions of all the required compilers, libraries and build tools. Using the frozen toolchain allows you to avoid changing anything on your build machine to meet Scylla's requirements - you just need to meet the frozen toolchain's prerequisites (mostly, having Docker or Podman available).
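
Before using it, you can quickly confirm that one of the container runtimes dbuild relies on is installed (a trivial check, not part of the official instructions):

$ docker --version || podman --version   # either runtime is sufficient for dbuild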

Building Scylla

Building Scylla with the frozen toolchain dbuild is as easy as:

$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla
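
If you just want a quick, less optimized binary for local experimentation, configure.py also accepts a --mode option; a dev-mode build (a sketch, assuming your checkout supports the dev mode) trades runtime performance for much faster compile and link times:

$ ./tools/toolchain/dbuild ./configure.py --mode=dev   # dev mode is an assumption; check ./configure.py --help
$ ./tools/toolchain/dbuild ninja build/dev/scylla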

For further information, please see:

Running Scylla

To start the Scylla server, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1

This will start a Scylla node with one CPU core allocated to it and data files stored in the tmp directory. The --developer-mode flag is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (not relevant on development workstations). Please note that you need to run Scylla with dbuild if you built it with the frozen toolchain.
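
Once the node is up, a quick sanity check is to connect to it with cqlsh (assuming cqlsh is installed on the host; by default Scylla serves CQL on 127.0.0.1:9042):

$ cqlsh 127.0.0.1 9042 -e 'DESCRIBE KEYSPACES'   # 9042 is the default CQL (native transport) port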

For more run options, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --help

Testing

See the test.py manual.
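
For a quick start, the test suite can be run through the same dbuild wrapper once the corresponding mode has been built (a minimal sketch; test selection and other flags are described in the test.py manual):

$ ./tools/toolchain/dbuild ./test.py --mode release   # flags here are illustrative; see the manual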

Scylla APIs and compatibility

By default, Scylla is compatible with Apache Cassandra and its APIs - CQL and Thrift. There is also support for the API of Amazon DynamoDB™, which needs to be enabled and configured in order to be used. For more information on how to enable the DynamoDB™ API in Scylla, and the current compatibility of this feature as well as Scylla-specific extensions, see Alternator and Getting started with Alternator.
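
As a rough illustration (the option names and values below are assumptions - consult the Alternator documentation for the authoritative list), enabling the DynamoDB API amounts to choosing a port for it and a write isolation policy when starting Scylla:

$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1 \
      --alternator-port 8000 --alternator-write-isolation always   # option names/values are illustrative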

Documentation

Documentation can be found here. Seastar documentation can be found here. User documentation can be found here.

Training

Training material and online courses can be found at Scylla University. The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling, administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions, multi-datacenters and how Scylla integrates with third-party applications.

Contributing to Scylla

If you want to report a bug or submit a pull request or a patch, please read the contribution guidelines.

If you are a developer working on Scylla, please read the developer guidelines.

Contact

  • The users mailing list and Slack channel are for users to discuss configuration, management, and operations of the open-source ScyllaDB.
  • The developers mailing list is for developers and people interested in following the development of ScyllaDB to discuss technical topics.
Comments
  • c-s latency caused by high latency from peer node


    1. Start 2 nodes n1, n2 using recent scylla master 1fd701e
    2. Enable slow query tracing on both nodes:
       curl -X POST "http://127.0.0.1:10000/storage_service/slow_query?enable=true&fast=false&threshold=80000"
       curl -X POST "http://127.0.0.2:10000/storage_service/slow_query?enable=true&fast=false&threshold=80000"
    3. Start c-s:
       cassandra-stress write no-warmup cl=TWO n=5000000 -schema 'replication(factor=2)' -port jmx=6868 -mode cql3 native -rate threads=200 -col 'size=FIXED(5) n=FIXED(8)' -pop seq=1500000000..2500000000
    4. Run repair to make c-s latency high to trigger the slow query tracing

    See the following trace: node 127.0.0.2 applies the write very fast (less than 100us), while the remote node 127.0.0.1 took 295677 us. This means the ~300ms latency seen by the client (c-s) was mostly contributed by the remote node. Due to the tracing issues I reported in https://github.com/scylladb/scylla/issues/9403, we do not know where the time was spent on the remote node - it might be disk, network or CPU contention. But I have a feeling the contention is on the network while repair runs, since we do not have a network scheduler. So the theory is that the remote node applies the write very quickly, but either the RPC message carrying the request or the one carrying the response is contended on the network, so in the end node 127.0.0.2 got the response with a high latency.

    cqlsh> SELECT * from system_traces.events WHERE session_id=ea0a5cc0-2021-11ec-be32-b254958ec4a2;
    
     session_id                           | event_id                             | activity                                                                                           | scylla_parent_id | scylla_span_id  | source    | source_elapsed | thread
    --------------------------------------+--------------------------------------+----------------------------------------------------------------------------------------------------+------------------+-----------------+-----------+----------------+---------
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a770a-2021-11ec-be32-b254958ec4a2 |                                                                                    Checking bounds |                0 | 373048741859841 | 127.0.0.2 |              0 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a770f-2021-11ec-be32-b254958ec4a2 |                                                                             Processing a statement |                0 | 373048741859841 | 127.0.0.2 |              0 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a781b-2021-11ec-be32-b254958ec4a2 | Creating write handler for token: -6493410074079723942 natural: {127.0.0.1, 127.0.0.2} pending: {} |                0 | 373048741859841 | 127.0.0.2 |             27 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a782e-2021-11ec-be32-b254958ec4a2 |                                  Creating write handler with live: {127.0.0.1, 127.0.0.2} dead: {} |                0 | 373048741859841 | 127.0.0.2 |             29 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a7850-2021-11ec-be32-b254958ec4a2 |                                                                 X Sending a mutation to /127.0.0.1 |                0 | 373048741859841 | 127.0.0.2 |             32 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a786a-2021-11ec-be32-b254958ec4a2 |                                                                     X Executing a mutation locally |                0 | 373048741859841 | 127.0.0.2 |             35 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a7993-2021-11ec-be32-b254958ec4a2 |                                                            Z Finished executing a mutation locally |                0 | 373048741859841 | 127.0.0.2 |             65 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a799c-2021-11ec-be32-b254958ec4a2 |                                                                     Got a response from /127.0.0.2 |                0 | 373048741859841 | 127.0.0.2 |             65 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a79e0-2021-11ec-be32-b254958ec4a2 |                                                        Z Finished Sending a mutation to /127.0.0.1 |                0 | 373048741859841 | 127.0.0.2 |             72 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea3794ed-2021-11ec-be32-b254958ec4a2 |                                                                     Got a response from /127.0.0.1 |                0 | 373048741859841 | 127.0.0.2 |         295677 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea3794f3-2021-11ec-be32-b254958ec4a2 |                                       Delay decision due to throttling: do not delay, resuming now |                0 | 373048741859841 | 127.0.0.2 |         295677 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea3797f8-2021-11ec-be32-b254958ec4a2 |                                                                    Mutation successfully completed |                0 | 373048741859841 | 127.0.0.2 |         295755 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea379808-2021-11ec-be32-b254958ec4a2 |                                                               Done processing - preparing a result |                0 | 373048741859841 | 127.0.0.2 |         295756 | shard 0
    
    (13 rows)
    
    latency Backport candidate Eng-3 
    opened by asias 203
  • Node stuck 12 hours in decommission


    Installation details
    Scylla version (or git commit hash): 3.1.0.rc5-0.20190902.623ea5e3d
    Cluster size: 4
    OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-02055ad6b0af5669b

    We see that the Thrift and CQL ports are closed, but the nodetool command is stuck:

    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] compaction - Compacted 1 sstables to [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-280840-big-Data.db:level=2,
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-279118-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] compaction - Compacting [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-280714-big-Data.db:level=1, /var/lib/scy
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] storage_service - DECOMMISSIONING: unbootstrap done
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] storage_service - Thrift server stopped
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] storage_service - CQL server stopped
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] storage_service - DECOMMISSIONING: shutdown rpc and cql server done
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] storage_service - DECOMMISSIONING: stop batchlog_manager done
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] gossip - My status = LEFT
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] gossip - No local state or state is in silent shutdown, not announcing shutdown
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] storage_service - DECOMMISSIONING: stop_gossiping done
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 12] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 10] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 9] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 8] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 9] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 13] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] rpc - client 10.0.113.188:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 6] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 13] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 4] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 13] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] rpc - client 10.0.113.188:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 4] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 13] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 2] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 3] rpc - client 10.0.113.188:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 7] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 2] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 3] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 5] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 5] rpc - client 10.0.113.188:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 5] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] rpc - client 10.0.113.188:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 6] compaction - Compacted 1 sstables to [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-286124-big-Data.db:level=2,
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 6] compaction - Compacting [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-285942-big-Data.db:level=1, ]
    Sep 04 22:43:34 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 6] compaction - Compacted 1 sstables to [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-286138-big-Data.db:level=2,
    Sep 04 22:43:34 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 6] compaction - Compacting [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-285956-big-Data.db:level=1, ]
    Sep 04 22:43:34 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 7] compaction - Compacted 9 sstables to [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-282149-big-Data.db:level=3,
    Sep 04 22:43:34 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 7] compaction - Compacting [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-278747-big-Data.db:level=3, /var/lib/scy
    Sep 04 22:43:34 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 10] compaction - Compacting [/var/lib/scylla/data/system/large_rows-40550f66085839a09430f27fc08034e9/mc-4252-big-Data.db:level=0, /var/lib/scylla
    Sep 04 22:43:35 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 10] compaction - Compacted 2 sstables to [/var/lib/scylla/data/system/large_rows-40550f66085839a09430f27fc08034e9/mc-4266-big-Data.db:level=0, ].
    

    nodetool has been stuck for more than 12 hours:

    [root@ip-10-0-142-68 centos]# ps -fp 119286
    UID         PID   PPID  C STIME TTY          TIME CMD
    centos   119286   1759  0 Sep04 ?        00:00:00 /bin/sh /usr/bin/nodetool -u cassandra -pw cassandra decommission
    [root@ip-10-0-142-68 centos]# date
    Thu Sep  5 12:57:52 UTC 2019
    [root@ip-10-0-142-68 centos]#
    

    Probably related to the nodetool drain stuck issue #4891 and the old issue #961.

    bug Regression 
    opened by bentsi 197
  • resharding + alternator LWT -> Scylla service takes 36 minutes to start


    Installation details

    Kernel Version: 5.13.0-1021-aws

    Scylla version (or git commit hash): 2022.1~rc3-20220406.5cc3b678c with build-id 48dfae0735cd8efc4ae2f5c777beaee2a1e89f4a

    Cluster size: 4 nodes (i3.4xlarge)

    Scylla Nodes used in this run:

    • alternator-48h-2022-1-db-node-81cb61d9-5 (34.241.246.188 | 10.0.2.75) (shards: 14)
    • alternator-48h-2022-1-db-node-81cb61d9-4 (52.30.41.107 | 10.0.3.6) (shards: 14)
    • alternator-48h-2022-1-db-node-81cb61d9-3 (52.214.185.121 | 10.0.1.89) (shards: 14)
    • alternator-48h-2022-1-db-node-81cb61d9-2 (34.242.68.250 | 10.0.1.112) (shards: 14)
    • alternator-48h-2022-1-db-node-81cb61d9-1 (176.34.90.117 | 10.0.0.237) (shards: 14)

    OS / Image: ami-071c70d20f0fdbb2c (aws: eu-west-1)

    Test: longevity-alternator-200gb-48h-test

    Test id: 81cb61d9-8d3f-45ae-8b50-f7882b4a6af8

    Test name: longevity/longevity-alternator-200gb-48h-test

    Test config file(s):

    Issue description

    At 2022-04-16 09:34:34.496 a restart-with-resharding nemesis started on node 4. The nemesis shuts down the scylla service, edits the murmur3_partitioner_ignore_msb_bits config value to force resharding, and starts the scylla service again, expecting initialization to take 5 minutes at most. This time, however, it took Scylla 36 minutes to start:

    2022-04-16T09:36:11+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - installing SIGHUP handler
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - Scylla version 2022.1.rc3-0.20220406.5cc3b678c with build-id 48dfae0735cd8efc4ae2f5c777beaee2a1e89f4a starting ...
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting prometheus API server
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting tokens manager
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting effective_replication_map factory
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting migration manager notifier
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting lifecycle notifier
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - creating tracing
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - creating snitch
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting API server
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - Scylla API server listening on 127.0.0.1:10000 ...
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - initializing storage service
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting gossiper
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - seeds={10.0.0.237}, listen_address=10.0.3.6, broadcast_address=10.0.3.6
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - initializing storage service
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting per-shard database core
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - creating and verifying directories
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting database
    2022-04-16T09:40:31+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting storage proxy
    2022-04-16T09:40:31+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting migration manager
    2022-04-16T09:40:31+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting query processor
    2022-04-16T09:40:31+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - initializing batchlog manager
    2022-04-16T09:40:31+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - loading system sstables
    2022-04-16T09:40:31+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - loading non-system sstables
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting view update generator
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - setting up system keyspace
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting commit log
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - initializing migration manager RPC verbs
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - initializing storage proxy RPC verbs
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting streaming service
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting hinted handoff manager
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting messaging service
    2022-04-16T09:51:20+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting CDC Generation Management service
    2022-04-16T09:51:20+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting CDC log service
    2022-04-16T09:51:20+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting storage service
    2022-04-16T09:51:20+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting sstables loader
    2022-04-16T10:07:47+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting system distributed keyspace
    2022-04-16T10:11:47+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting tracing
    2022-04-16T10:11:48+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - SSTable data integrity checker is disabled.
    2022-04-16T10:11:48+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting auth service
    2022-04-16T10:11:50+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting batchlog manager
    2022-04-16T10:11:50+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting load meter
    2022-04-16T10:11:50+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting cf cache hit rate calculator
    2022-04-16T10:11:50+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting view update backlog broker
    2022-04-16T10:11:53+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - Waiting for gossip to settle before accepting client requests...
    2022-04-16T10:12:06+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - allow replaying hints
    2022-04-16T10:12:07+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - Launching generate_mv_updates for non system tables
    2022-04-16T10:12:07+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting the view builder
    2022-04-16T10:12:25+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting native transport
    2022-04-16T10:12:26+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting the expiration service
    2022-04-16T10:12:27+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - serving
    2022-04-16T10:12:27+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - Scylla version 2022.1.rc3-0.20220406.5cc3b678c initialization completed.
    

    In particular, the loading phases took far longer than usual.
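
    One quick way to see which init phases dominated is to diff the timestamps of consecutive init lines (a sketch, assuming GNU date and the node's log saved locally as scylla.log):

    $ grep ' init - ' scylla.log | while read -r ts rest; do
          now=$(date -d "$ts" +%s)                              # parse the ISO-8601 timestamp
          [ -n "$prev" ] && echo "$((now - prev))s  $rest"      # seconds since the previous init line
          prev=$now
      done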

    • Restore Monitor Stack command: $ hydra investigate show-monitor 81cb61d9-8d3f-45ae-8b50-f7882b4a6af8
    • Restore monitor on AWS instance using Jenkins job
    • Show all stored logs command: $ hydra investigate show-logs 81cb61d9-8d3f-45ae-8b50-f7882b4a6af8

    Logs:

    db-cluster: https://cloudius-jenkins-test.s3.amazonaws.com/81cb61d9-8d3f-45ae-8b50-f7882b4a6af8/20220424_100353/db-cluster-81cb61d9.tar.gz
    loader-set: https://cloudius-jenkins-test.s3.amazonaws.com/81cb61d9-8d3f-45ae-8b50-f7882b4a6af8/20220424_100353/loader-set-81cb61d9.tar.gz
    monitor-set: https://cloudius-jenkins-test.s3.amazonaws.com/81cb61d9-8d3f-45ae-8b50-f7882b4a6af8/20220424_100353/monitor-set-81cb61d9.tar.gz

    Jenkins job URL

    Regression compaction resharding 
    opened by ShlomiBalalis 143
  • Coredumps during restart_then_repair_node nemesis


    This is Scylla's bug tracker, to be used for reporting bugs only. If you have a question about Scylla, and not a bug, please ask it in our mailing-list at [email protected] or in our slack channel.

    • [x] I have read the disclaimer above, and I am reporting a suspected malfunction in Scylla.

    Installation details
    Scylla version (or git commit hash): 3.1.0.rc4-0.20190826.e4a39ed31
    Cluster size: 4
    OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-0ececa5cacea302a8

    During restart_then_repair_node, the target node (# 5) suffered from streaming exceptions:

    (DatabaseLogEvent Severity.CRITICAL): type=DATABASE_ERROR regex=Exception  line_number=26510 node=Node longevity-large-partitions-4d-3-1-db-node-49dc20d4-5 [52.50.193.198 | 10.0.133.1] (seed: False)
    2019-08-27T22:03:51+00:00  ip-10-0-133-1 !WARNING | scylla: [shard 0] range_streamer - Bootstrap with 10.0.10.203 for keyspace=scylla_bench failed, took 773.173 seconds: streaming::stream_exception (Stream failed)
    

    Two other nodes, meanwhile, suffered from semaphore timeouts (could be related to #4615):

    (DatabaseLogEvent Severity.CRITICAL): type=DATABASE_ERROR regex=Exception  line_number=14442 node=Node longevity-large-partitions-4d-3-1-db-node-49dc20d4-4 [34.245.137.134 | 10.0.178.144] (seed: False)
    2019-08-27T22:06:09+00:00  ip-10-0-178-144 !ERR     | scylla: [shard 7] storage_proxy - Exception when communicating with 10.0.178.144: seastar::semaphore_timed_out (Semaphore timedout)
    

    and created coredumps like so:

    (CoreDumpEvent Severity.CRITICAL): node=Node longevity-large-partitions-4d-3-1-db-node-49dc20d4-4 [34.245.137.134 | 10.0.178.144] (seed: False)
    corefile_urls=
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.4406.1566942687000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.4406.1566942687000000.gz.aa
    backtrace=           PID: 4406 (scylla)
               UID: 996 (scylla)
               GID: 1001 (scylla)
            Signal: 6 (ABRT)
         Timestamp: Tue 2019-08-27 21:51:27 UTC (1min 55s ago)
      Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-11
        Executable: /opt/scylladb/libexec/scylla
     Control Group: /
           Boot ID: 9f0393fe20f04dfab829e5bb5cc4bdad
        Machine ID: df877a200226bc47d06f26dae0736ec9
          Hostname: ip-10-0-178-144.eu-west-1.compute.internal
          Coredump: /var/lib/systemd/coredump/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.4406.1566942687000000
           Message: Process 4406 (scylla) of user 996 dumped core.
                    
                    Stack trace of thread 4430:
                    #0  0x00007f95cfcc953f raise (libc.so.6)
                    #1  0x00007f95cfcb395e abort (libc.so.6)
                    #2  0x00000000040219ab on_allocation_failure (scylla)
    

    Here are links to all coredumps of this kind; there is currently a bit of an issue with uploading them, so hopefully at least one of them uploaded correctly:

    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.16744.1566943317000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.16744.1566943317000000.gz.aa
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.17150.1566944278000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.17150.1566944278000000.gz.aa
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.17686.1566944862000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.17686.1566944862000000.gz.aa
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.18167.1566945503000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.18167.1566945503000000.gz.aa
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.18731.1566946375000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.18731.1566946375000000.gz.aa
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.19423.1566947108000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.19423.1566947108000000.gz.aa
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20078.1566947830000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20078.1566947830000000.gz.aa
    
    
    (CoreDumpEvent Severity.CRITICAL): node=Node longevity-large-partitions-4d-3-1-db-node-49dc20d4-4 [34.245.137.134 | 10.0.178.144] (seed: False)
    corefile_urls=
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20078.1566947830000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20078.1566947830000000.gz.aa
    backtrace=           PID: 20078 (scylla)
               UID: 996 (scylla)
               GID: 1001 (scylla)
            Signal: 6 (ABRT)
         Timestamp: Tue 2019-08-27 23:17:10 UTC (1min 57s ago)
      Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-11
        Executable: /opt/scylladb/libexec/scylla
     Control Group: /
           Boot ID: 9f0393fe20f04dfab829e5bb5cc4bdad
        Machine ID: df877a200226bc47d06f26dae0736ec9
          Hostname: ip-10-0-178-144.eu-west-1.compute.internal
          Coredump: /var/lib/systemd/coredump/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20078.1566947830000000
           Message: Process 20078 (scylla) of user 996 dumped core.
                    
                    Stack trace of thread 20089:
                    #0  0x00007fa119b2853f raise (libc.so.6)
                    #1  0x00007fa119b1295e abort (libc.so.6)
                    #2  0x0000000000469b8e _ZN8logalloc18allocating_section7reserveEv (scylla)
    

    Different backtraces and their translations from the run:

    Aug 27 22:03:29 ip-10-0-10-203.eu-west-1.compute.internal scylla[5160]:  [shard 10] seastar - Failed to allocate 851968 bytes
    0x00000000041808b2
    0x000000000406d935
    0x000000000406dc35
    0x000000000406dce3
    0x00007f7420b4602f
    /opt/scylladb/libreloc/libc.so.6+0x000000000003853e
    /opt/scylladb/libreloc/libc.so.6+0x0000000000022894
    0x00000000040219aa
    0x0000000004022a0e
    0x000000000131bcb3
    0x000000000137d78f
    0x000000000131725f
    0x000000000136c8b1
    0x00000000014555b5
    0x0000000001296442
    0x000000000145c35a
    0x000000000406ae21
    0x000000000406b01e
    0x000000000414d06d
    0x00000000041776ab
    0x00000000040390dd
    /opt/scylladb/libreloc/libpthread.so.0+0x000000000000858d
    /opt/scylladb/libreloc/libc.so.6+0x00000000000fd6a2
    
    

    translated:

    void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at /usr/include/boost/program_options/variables_map.hpp:146
    seastar::backtrace_buffer::append_backtrace() at /usr/include/boost/program_options/variables_map.hpp:146
     (inlined by) print_with_backtrace at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:1104
    seastar::print_with_backtrace(char const*) at /usr/include/boost/program_options/variables_map.hpp:146
    sigabrt_action at /usr/include/boost/program_options/variables_map.hpp:146
     (inlined by) operator() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:5012
     (inlined by) _FUN at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:5008
    ?? ??:0
    ?? ??:0
    ?? ??:0
    seastar::memory::on_allocation_failure(unsigned long) at memory.cc:?
    operator new(unsigned long) at ??:?
     (inlined by) operator new(unsigned long) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/memory.cc:1674
    seastar::circular_buffer<sstables::promoted_index_block, std::allocator<sstables::promoted_index_block> >::expand(unsigned long) at crtstuff.c:?
     (inlined by) seastar::circular_buffer<sstables::promoted_index_block, std::allocator<sstables::promoted_index_block> >::expand(unsigned long) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/circular_buffer.hh:301
    sstables::promoted_index_blocks_reader::process_state(seastar::temporary_buffer<char>&, sstables::promoted_index_blocks_reader::m_parser_context&) at crtstuff.c:?
     (inlined by) seastar::circular_buffer<sstables::promoted_index_block, std::allocator<sstables::promoted_index_block> >::maybe_expand(unsigned long) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/circular_buffer.hh:331
     (inlined by) void seastar::circular_buffer<sstables::promoted_index_block, std::allocator<sstables::promoted_index_block> >::emplace_back<position_in_partition, position_in_partition, unsigned long&, unsigned long&, std::optional<sstables::deletion_time> >(position_in_partition&&, position_in_partition&&, unsigned long&, unsigned long&, std::optional<sstables::deletion_time>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/circular_buffer.hh:391
     (inlined by) sstables::promoted_index_blocks_reader::process_state(seastar::temporary_buffer<char>&, sstables::promoted_index_blocks_reader::m_parser_context&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/./sstables/index_entry.hh:416
    data_consumer::continuous_data_consumer<sstables::promoted_index_blocks_reader>::process(seastar::temporary_buffer<char>&) at crtstuff.c:?
     (inlined by) sstables::promoted_index_blocks_reader::process_state(seastar::temporary_buffer<char>&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/./sstables/index_entry.hh:456
     (inlined by) data_consumer::continuous_data_consumer<sstables::promoted_index_blocks_reader>::process(seastar::temporary_buffer<char>&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/consumer.hh:404
    seastar::future<> seastar::repeat<seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::promoted_index_blocks_reader> >(std::reference_wrapper<sstables::promoted_index_blocks_reader>&&)::{lambda()#1}>(seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::promoted_index_blocks_reader> >(std::reference_wrapper<sstables::promoted_index_blocks_reader>&&)::{lambda()#1}) at crtstuff.c:?
     (inlined by) seastar::future<seastar::consumption_result<char> > std::__invoke_impl<seastar::future<seastar::consumption_result<char> >, sstables::promoted_index_blocks_reader&, seastar::temporary_buffer<char> >(std::__invoke_other, sstables::promoted_index_blocks_reader&, seastar::temporary_buffer<char>&&) at /usr/include/c++/8/bits/invoke.h:60
     (inlined by) std::__invoke_result<sstables::promoted_index_blocks_reader&, seastar::temporary_buffer<char> >::type std::__invoke<sstables::promoted_index_blocks_reader&, seastar::temporary_buffer<char> >(sstables::promoted_index_blocks_reader&, seastar::temporary_buffer<char>&&) at /usr/include/c++/8/bits/invoke.h:96
     (inlined by) std::result_of<sstables::promoted_index_blocks_reader& (seastar::temporary_buffer<char>&&)>::type std::reference_wrapper<sstables::promoted_index_blocks_reader>::operator()<seastar::temporary_buffer<char> >(seastar::temporary_buffer<char>&&) const at /usr/include/c++/8/bits/refwrap.h:319
     (inlined by) seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::promoted_index_blocks_reader> >(std::reference_wrapper<sstables::promoted_index_blocks_reader>&&)::{lambda()#1}::operator()() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/iostream-impl.hh:227
     (inlined by) seastar::future<> seastar::repeat<seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::promoted_index_blocks_reader> >(std::reference_wrapper<sstables::promoted_index_blocks_reader>&&)::{lambda()#1}>(seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::promoted_index_blocks_reader> >(std::reference_wrapper<sstables::promoted_index_blocks_reader>&&)::{lambda()#1}) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future-util.hh:285
    sstables::index_reader::advance_upper_past(position_in_partition_view) at crtstuff.c:?
     (inlined by) seastar::future<> seastar::input_stream<char>::consume<sstables::promoted_index_blocks_reader>(sstables::promoted_index_blocks_reader&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/iostream-impl.hh:236
     (inlined by) data_consumer::continuous_data_consumer<sstables::promoted_index_blocks_reader>::consume_input() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/consumer.hh:377
     (inlined by) sstables::index_entry::get_next_pi_blocks() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/./sstables/index_entry.hh:614
     (inlined by) sstables::index_reader::advance_upper_past(position_in_partition_view) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/index_reader.hh:582
    seastar::future<bool> seastar::futurize<seastar::future<bool> >::apply<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}>(sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}&&, std::tuple<>&&) [clone .constprop.7996] at sstables.cc:?
     (inlined by) seastar::apply_helper<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}, std::tuple<>&&, std::integer_sequence<unsigned long> >::apply({lambda()#1}&&, std::tuple) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/apply.hh:35
     (inlined by) auto seastar::apply<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}>(sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}&&, std::tuple<>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/apply.hh:43
     (inlined by) seastar::future<bool> seastar::futurize<seastar::future<bool> >::apply<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}>(sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}&&, std::tuple<>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1392
    _ZN7seastar12continuationIZZNS_6futureIJEE9then_implIZN8sstables12index_reader34advance_lower_and_check_if_presentEN3dht18ring_position_viewESt8optionalI26position_in_partition_viewEEUlvE_NS1_IJbEEEEET0_OT_ENKUlvE_clEvEUlSF_E_JEE15run_and_disposeEv at crtstuff.c:?
     (inlined by) seastar::future<bool> seastar::future<>::then<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}, seastar::future<bool> >(sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:917
     (inlined by) sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/index_reader.hh:775
     (inlined by) seastar::apply_helper<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}, std::tuple<>&&, std::integer_sequence<unsigned long> >::apply({lambda()#1}&&, std::tuple<>) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/apply.hh:35
     (inlined by) auto seastar::apply<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}>(sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}&&, std::tuple<>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/apply.hh:43
     (inlined by) seastar::future<bool> seastar::futurize<seastar::future<bool> >::apply<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}>(sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}&&, std::tuple<>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1392
     (inlined by) _ZZZN7seastar6futureIJEE9then_implIZN8sstables12index_reader34advance_lower_and_check_if_presentEN3dht18ring_position_viewESt8optionalI26position_in_partition_viewEEUlvE_NS0_IJbEEEEET0_OT_ENKUlvE_clEvENUlSE_E_clINS_12future_stateIJEEEEEDaSE_ at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:950
     (inlined by) _ZN7seastar12continuationIZZNS_6futureIJEE9then_implIZN8sstables12index_reader34advance_lower_and_check_if_presentEN3dht18ring_position_viewESt8optionalI26position_in_partition_viewEEUlvE_NS1_IJbEEEEET0_OT_ENKUlvE_clEvEUlSF_E_JEE15run_and_disposeEv at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:377
    seastar::reactor::run_tasks(seastar::reactor::task_queue&) at /usr/include/boost/program_options/variables_map.hpp:146
    seastar::reactor::run_some_tasks() at /usr/include/boost/program_options/variables_map.hpp:146
    seastar::reactor::run_some_tasks() at /usr/include/boost/program_options/variables_map.hpp:146
     (inlined by) seastar::reactor::run() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:4243
    seastar::smp::configure(boost::program_options::variables_map, seastar::reactor_config)::{lambda()#3}::operator()() const at /usr/include/boost/program_options/variables_map.hpp:146
    std::function<void ()>::operator()() const at /usr/include/c++/8/bits/std_function.h:687
     (inlined by) seastar::posix_thread::start_routine(void*) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/posix.cc:52
    
    Aug 27 22:18:38 ip-10-0-10-203.eu-west-1.compute.internal scylla[31878]:  [shard 0] seastar - Failed to allocate 131072 bytes
    0x00000000041808b2
    0x000000000406d935
    0x000000000406dc35
    0x000000000406dce3
    0x00007f2e2d4c002f
    /opt/scylladb/libreloc/libc.so.6+0x000000000003853e
    /opt/scylladb/libreloc/libc.so.6+0x0000000000022894
    0x00000000040219aa
    0x00000000040235f4
    0x0000000004124fac
    0x0000000000cf0f7d
    0x0000000000cf1027
    0x00000000036ec4bb
    0x0000000004000a85
    0x00000000040030f4
    0x0000000001523df0
    0x0000000001581d82
    0x00000000015a5249
    0x00000000015a7e14
    0x0000000001094cdf
    0x000000000109798d
    0x000000000109872d
    0x000000000109b983
    0x000000000109d0d5
    0x00000000010c786f
    0x00000000010985c1
    0x000000000109b983
    0x000000000109d0d5
    0x00000000010e9aae
    0x00000000010eaa19
    0x0000000000e52db4
    0x000000000406ae21
    0x000000000406b01e
    0x000000000414d06d
    0x0000000003fd51d6
    0x0000000003fd6922
    0x00000000007d9d69
    /opt/scylladb/libreloc/libc.so.6+0x0000000000024412
    0x000000000083a1fd
    
    

    Translation:

    void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at /usr/include/boost/program_options/variables_map.hpp:146
    seastar::backtrace_buffer::append_backtrace() at /usr/include/boost/program_options/variables_map.hpp:146
     (inlined by) print_with_backtrace at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:1104
    seastar::print_with_backtrace(char const*) at /usr/include/boost/program_options/variables_map.hpp:146
    sigabrt_action at /usr/include/boost/program_options/variables_map.hpp:146
     (inlined by) operator() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:5012
     (inlined by) _FUN at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:5008
    ?? ??:0
    ?? ??:0
    ?? ??:0
    seastar::memory::on_allocation_failure(unsigned long) at memory.cc:?
    __libc_posix_memalign at ??:?
     (inlined by) posix_memalign at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/memory.cc:1601
    seastar::temporary_buffer<unsigned char>::aligned(unsigned long, unsigned long) at /usr/include/boost/program_options/variables_map.hpp:146
     (inlined by) seastar::file::read_state<unsigned char>::read_state(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/file.hh:481
     (inlined by) seastar::shared_ptr_no_esft<seastar::file::read_state<unsigned char> >::shared_ptr_no_esft<unsigned long&, unsigned long&, unsigned long&, unsigned int&, unsigned int&>(unsigned long&, unsigned long&, unsigned long&, unsigned int&, unsigned int&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/shared_ptr.hh:164
     (inlined by) seastar::lw_shared_ptr<seastar::file::read_state<unsigned char> > seastar::lw_shared_ptr<seastar::file::read_state<unsigned char> >::make<unsigned long&, unsigned long&, unsigned long&, unsigned int&, unsigned int&>(unsigned long&, unsigned long&, unsigned long&, unsigned int&, unsigned int&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/shared_ptr.hh:266
     (inlined by) seastar::lw_shared_ptr<seastar::file::read_state<unsigned char> > seastar::make_lw_shared<seastar::file::read_state<unsigned char>, unsigned long&, unsigned long&, unsigned long&, unsigned int&, unsigned int&>(unsigned long&, unsigned long&, unsigned long&, unsigned int&, unsigned int&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/shared_ptr.hh:416
     (inlined by) seastar::posix_file_impl::dma_read_bulk(unsigned long, unsigned long, seastar::io_priority_class const&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:2352
    checked_file_impl::dma_read_bulk(unsigned long, unsigned long, seastar::io_priority_class const&)::{lambda()#1}::operator()() const at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/sstring.hh:257
     (inlined by) auto do_io_check<checked_file_impl::dma_read_bulk(unsigned long, unsigned long, seastar::io_priority_class const&)::{lambda()#1}, , void>(std::function<void (std::__exception_ptr::exception_ptr)> const&, checked_file_impl::dma_read_bulk(unsigned long, unsigned long, seastar::io_priority_class const&)::{lambda()#1}&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/disk-error-handler.hh:73
    checked_file_impl::dma_read_bulk(unsigned long, unsigned long, seastar::io_priority_class const&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/sstring.hh:257
    tracking_file_impl::dma_read_bulk(unsigned long, unsigned long, seastar::io_priority_class const&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/reader_concurrency_semaphore.cc:184
    seastar::future<seastar::temporary_buffer<char> > seastar::file::dma_read_bulk<char>(unsigned long, unsigned long, seastar::io_priority_class const&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/file.hh:421
     (inlined by) seastar::file_data_source_impl::issue_read_aheads(unsigned int)::{lambda()#1}::operator()() const at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/fstream.cc:256
     (inlined by) seastar::future<seastar::temporary_buffer<char> > seastar::futurize<seastar::future<seastar::temporary_buffer<char> > >::apply<seastar::file_data_source_impl::issue_read_aheads(unsigned int)::{lambda()#1}>(seastar::file_data_source_impl::issue_read_aheads(unsigned int)::{lambda()#1}&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/future.hh:1402
     (inlined by) seastar::file_data_source_impl::issue_read_aheads(unsigned int) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/fstream.cc:255
    seastar::file_data_source_impl::get() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/fstream.cc:173
    seastar::data_source::get() at /usr/include/c++/8/variant:1356
     (inlined by) seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::data_consume_rows_context_m> >(std::reference_wrapper<sstables::data_consume_rows_context_m>&&)::{lambda()#1}::operator()() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/iostream-impl.hh:206
     (inlined by) seastar::future<> seastar::repeat<seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::data_consume_rows_context_m> >(std::reference_wrapper<sstables::data_consume_rows_context_m>&&)::{lambda()#1}>(seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::data_consume_rows_context_m> >(std::reference_wrapper<sstables::data_consume_rows_context_m>&&)::{lambda()#1}) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future-util.hh:285
    sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const at crtstuff.c:?
     (inlined by) seastar::future<> seastar::input_stream<char>::consume<sstables::data_consume_rows_context_m>(sstables::data_consume_rows_context_m&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/iostream-impl.hh:236
     (inlined by) data_consumer::continuous_data_consumer<sstables::data_consume_rows_context_m>::consume_input() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/consumer.hh:377
     (inlined by) sstables::data_consume_context<sstables::data_consume_rows_context_m>::read() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/data_consume_context.hh:98
     (inlined by) sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}::operator()() const at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/partition.cc:479
     (inlined by) seastar::apply_helper<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}, std::tuple<>&&, std::integer_sequence<unsigned long> >::apply({lambda()#2}&&, sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/apply.hh:35
     (inlined by) auto seastar::apply<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}&&, std::tuple<>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/apply.hh:43
     (inlined by) seastar::future<> seastar::futurize<seastar::future<> >::apply<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}&&, std::tuple<>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1392
     (inlined by) seastar::future<> seastar::future<>::then_impl<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}, seastar::future<> >(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:936
     (inlined by) seastar::future<> seastar::future<>::then<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}, seastar::future<> >(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:917
     (inlined by) sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/partition.cc:480
    seastar::future<> seastar::do_until<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}, sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#2}>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#2}, sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}) at crtstuff.c:?
     (inlined by) seastar::future<> seastar::futurize<void>::apply<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}&>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1385
     (inlined by) seastar::future<> seastar::do_until<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}, sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#1}>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#1}, sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future-util.hh:507
     (inlined by) sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/partition.cc:481
     (inlined by) seastar::future<> seastar::do_void_futurize_helper<seastar::future<> >::apply<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}&>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1359
     (inlined by) seastar::future<> seastar::futurize<void>::apply<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}&>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1385
     (inlined by) seastar::future<> seastar::do_until<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}, sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#2}>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#2}, sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future-util.hh:507
    sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at crtstuff.c:?
    flat_mutation_reader::impl::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
     (inlined by) flat_mutation_reader::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/flat_mutation_reader.hh:337
     (inlined by) operator() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:308
     (inlined by) apply<mutation_reader_merger::prepare_next(seastar::lowres_clock::time_point)::<lambda(mutation_reader_merger::reader_and_last_fragment_kind)>, mutation_reader_merger::reader_and_last_fragment_kind&> at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1402
     (inlined by) futurize_apply<mutation_reader_merger::prepare_next(seastar::lowres_clock::time_point)::<lambda(mutation_reader_merger::reader_and_last_fragment_kind)>, mutation_reader_merger::reader_and_last_fragment_kind&> at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1474
     (inlined by) parallel_for_each<mutation_reader_merger::reader_and_last_fragment_kind*, mutation_reader_merger::prepare_next(seastar::lowres_clock::time_point)::<lambda(mutation_reader_merger::reader_and_last_fragment_kind)> > at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future-util.hh:129
    parallel_for_each<utils::small_vector<mutation_reader_merger::reader_and_last_fragment_kind, 4>&, mutation_reader_merger::prepare_next(seastar::lowres_clock::time_point)::<lambda(mutation_reader_merger::reader_and_last_fragment_kind)> > at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
     (inlined by) mutation_reader_merger::prepare_next(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:307
    mutation_reader_merger::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
    mutation_fragment_merger<mutation_reader_merger>::fetch(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
     (inlined by) mutation_fragment_merger<mutation_reader_merger>::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:120
     (inlined by) operator() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:489
    repeat<combined_mutation_reader::fill_buffer(seastar::lowres_clock::time_point)::<lambda()> > at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
     (inlined by) combined_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:500
    flat_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/flat_mutation_reader.hh:391
     (inlined by) restricting_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda(flat_mutation_reader&)#1}::operator()(flat_mutation_reader&) const at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:637
     (inlined by) _ZN27restricting_mutation_reader11with_readerIZNS_11fill_bufferENSt6chrono10time_pointIN7seastar12lowres_clockENS1_8durationIlSt5ratioILl1ELl1000EEEEEEEUlR20flat_mutation_readerE_EEDcT_S9_ at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:610
     (inlined by) restricting_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:641
    flat_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
     (inlined by) mutation_reader_merger::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:384
    mutation_fragment_merger<mutation_reader_merger>::fetch(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
     (inlined by) mutation_fragment_merger<mutation_reader_merger>::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:120
     (inlined by) operator() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:489
    repeat<combined_mutation_reader::fill_buffer(seastar::lowres_clock::time_point)::<lambda()> > at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
     (inlined by) combined_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:500
    flat_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /usr/include/c++/8/bits/unique_ptr.h:81
     (inlined by) operator() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/flat_mutation_reader.cc:681
    apply<flat_multi_range_mutation_reader<Generator>::fill_buffer(seastar::lowres_clock::time_point) [with Generator = make_flat_multi_range_reader(schema_ptr, mutation_source, const partition_range_vector&, const query::partition_slice&, const seastar::io_priority_class&, tracing::trace_state_ptr, mutation_reader::forwarding)::adapter]::<lambda()>&> at /usr/include/c++/8/bits/unique_ptr.h:81
     (inlined by) apply<flat_multi_range_mutation_reader<Generator>::fill_buffer(seastar::lowres_clock::time_point) [with Generator = make_flat_multi_range_reader(schema_ptr, mutation_source, const partition_range_vector&, const query::partition_slice&, const seastar::io_priority_class&, tracing::trace_state_ptr, mutation_reader::forwarding)::adapter]::<lambda()>&> at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1385
     (inlined by) do_until<flat_multi_range_mutation_reader<Generator>::fill_buffer(seastar::lowres_clock::time_point) [with Generator = make_flat_multi_range_reader(schema_ptr, mutation_source, const partition_range_vector&, const query::partition_slice&, const seastar::io_priority_class&, tracing::trace_state_ptr, mutation_reader::forwarding)::adapter]::<lambda()>, flat_multi_range_mutation_reader<Generator>::fill_buffer(seastar::lowres_clock::time_point) [with Generator = make_flat_multi_range_reader(schema_ptr, mutation_source, const partition_range_vector&, const query::partition_slice&, const seastar::io_priority_class&, tracing::trace_state_ptr, mutation_reader::forwarding)::adapter]::<lambda()> > at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future-util.hh:507
     (inlined by) fill_buffer at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/flat_mutation_reader.cc:682
    _ZN7seastar8internal8repeaterIZZ19fragment_and_freeze20flat_mutation_readerSt8functionIFNS_6futureIJNS_10bool_classINS_18stop_iteration_tagEEEEEE15frozen_mutationbEEmENKUlRT_RT0_E_clIS2_28fragmenting_mutation_freezerEEDaSD_SF_EUlvE_E15run_and_disposeEv at frozen_mutation.cc:?
     (inlined by) flat_mutation_reader::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/flat_mutation_reader.hh:337
     (inlined by) operator() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/frozen_mutation.cc:259
     (inlined by) run_and_dispose at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future-util.hh:218
    seastar::reactor::run_tasks(seastar::reactor::task_queue&) at /usr/include/boost/program_options/variables_map.hpp:146
    seastar::reactor::run_some_tasks() at /usr/include/boost/program_options/variables_map.hpp:146
    seastar::reactor::run_some_tasks() at /usr/include/boost/program_options/variables_map.hpp:146
     (inlined by) seastar::reactor::run() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:4243
    seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/future.hh:768
    seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/future.hh:768
    main at crtstuff.c:?
    ?? ??:0
    _start at ??:?
    
    (CoreDumpEvent Severity.CRITICAL): node=Node longevity-large-partitions-4d-3-1-db-node-49dc20d4-4 [34.245.137.134 | 10.0.178.144] (seed: False)
    corefile_urls=
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20758.1566948577000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20758.1566948577000000.gz.aa
    backtrace=           PID: 20758 (scylla)
               UID: 996 (scylla)
               GID: 1001 (scylla)
            Signal: 6 (ABRT)
         Timestamp: Tue 2019-08-27 23:29:37 UTC (1min 52s ago)
      Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-11
        Executable: /opt/scylladb/libexec/scylla
     Control Group: /
           Boot ID: 9f0393fe20f04dfab829e5bb5cc4bdad
        Machine ID: df877a200226bc47d06f26dae0736ec9
          Hostname: ip-10-0-178-144.eu-west-1.compute.internal
          Coredump: /var/lib/systemd/coredump/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20758.1566948577000000
           Message: Process 20758 (scylla) of user 996 dumped core.
                    
                    Stack trace of thread 20769:
                    #0  0x00007f179a95053f raise (libc.so.6)
                    #1  0x00007f179a93a95e abort (libc.so.6)
                    #2  0x0000000000469b8e _ZN8logalloc18allocating_section7reserveEv (scylla)
                    #3  0x00000000071c0d93 n/a (n/a)
    
    

    The other node's coredumps:

    (CoreDumpEvent Severity.CRITICAL): node=Node longevity-large-partitions-4d-3-1-db-node-49dc20d4-1 [63.35.248.143 | 10.0.10.203] (seed: True)
    corefile_urls=
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.3f7c927968ca4130a5cfc4b02933017f.5160.1566943409000000.gz/core.scylla.996.3f7c927968ca4130a5cfc4b02933017f.5160.1566943409000000.gz.aa
    backtrace=           PID: 5160 (scylla)
               UID: 996 (scylla)
               GID: 1001 (scylla)
            Signal: 6 (ABRT)
         Timestamp: Tue 2019-08-27 22:03:29 UTC (1min 55s ago)
      Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-11
        Executable: /opt/scylladb/libexec/scylla
     Control Group: /
           Boot ID: 3f7c927968ca4130a5cfc4b02933017f
        Machine ID: df877a200226bc47d06f26dae0736ec9
          Hostname: ip-10-0-10-203.eu-west-1.compute.internal
          Coredump: /var/lib/systemd/coredump/core.scylla.996.3f7c927968ca4130a5cfc4b02933017f.5160.1566943409000000
           Message: Process 5160 (scylla) of user 996 dumped core.
                    
                    Stack trace of thread 5170:
                    #0  0x00007f742044b53f raise (libc.so.6)
                    #1  0x00007f742043595e abort (libc.so.6)
                    #2  0x00000000040219ab on_allocation_failure (scylla)
    

    Other download locations:

    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.3f7c927968ca4130a5cfc4b02933017f.31878.1566944318000000.gz/core.scylla.996.3f7c927968ca4130a5cfc4b02933017f.31878.1566944318000000.gz.aa
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.3f7c927968ca4130a5cfc4b02933017f.32438.1566945161000000.gz/core.scylla.996.3f7c927968ca4130a5cfc4b02933017f.32438.1566945161000000.gz.aa
    
    

    Relevant journalctl logs of the nodes can be found at scratch.scylladb.com/shlomib/longevity-large-partitions-4d-db-cluster.tar

    bug 
    opened by ShlomiBalalis 140
  • Significant fall down of operations per seconds during replace node

    Significant fall down of operations per seconds during replace node

    Installation details
    Scylla version (or git commit hash): 4.2.rc4-0.20200914.338196eab with build-id 7670ef1d82ff6b35783e1035d6544c7cc9abd90f
    Cluster size: 5
    OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-0bb0f15782d03eec3 (eu-north-1)
    Instance type: i3.4xlarge

    During job https://jenkins.scylladb.com/view/scylla-4.2/job/scylla-4.2/job/longevity/job/longevity-mv-si-4days-test/5 the TerminateAndReplace nemesis was executed several times. This nemesis terminates the instance of one node (node4) and then adds a new node (node6). While the new node was being added, operations per second dropped from 25k ops to 81 ops on each nemesis execution: Screenshot from 2020-09-17 18-32-33

    The second time, node5 was terminated and node8 was added as its replacement:

    Screenshot from 2020-09-17 18-41-53

    monitoring node available: http://13.49.78.221:3000/d/N0wDzKdGk/scylla-per-server-metrics-nemesis-master?orgId=1&from=1600146424237&to=1600300730028&var-by=instance&var-cluster=&var-dc=All&var-node=All&var-shard=All&var-sct_tags=DatabaseLogEvent&var-sct_tags=DisruptionEvent
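
    For context, a node replacement of this kind is normally done by bootstrapping the fresh node with the terminated node's address configured in scylla.yaml before its first start. A minimal sketch of the usual procedure (the yaml key, paths, and the IP placeholder are illustrative and should be checked against the documentation for the exact version; they are not taken from this report):

    # On the new replacement node, before starting Scylla for the first time:
    echo 'replace_address_first_boot: <ip-of-terminated-node>' | sudo tee -a /etc/scylla/scylla.yaml
    # Start Scylla; it streams the dead node's data from the surviving replicas,
    # which is exactly the window in which the cluster also has to serve the client load.
    sudo systemctl start scylla-server
    # The new node should move from UJ (joining) to UN (normal) as the replace completes.
    nodetool status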

    Next c-s commands:

    2020-09-15 14:49:50.535: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-1 [13.48.13.140 | 10.0.1.125] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_4mv_5queries.yaml ops'(insert=15,read1=1,read2=1,read3=1,read4=1,read5=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    2020-09-15 14:50:20.510: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-2 [13.53.109.33 | 10.0.2.224] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_4mv_5queries.yaml ops'(insert=15,read1=1,read2=1,read3=1,read4=1,read5=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    2020-09-15 14:50:30.964: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-1 [13.48.13.140 | 10.0.1.125] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_2mv_2queries.yaml ops'(insert=6,mv_p_read1=1,mv_p_read2=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    2020-09-15 14:51:00.837: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-2 [13.53.109.33 | 10.0.2.224] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_2mv_2queries.yaml ops'(insert=6,mv_p_read1=1,mv_p_read2=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    2020-09-15 14:51:11.261: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-1 [13.48.13.140 | 10.0.1.125] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_3si_5queries.yaml ops'(insert=25,si_read1=1,si_read2=1,si_read3=1,si_read4=1,si_read5=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    2020-09-15 14:51:41.228: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-2 [13.53.109.33 | 10.0.2.224] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_3si_5queries.yaml ops'(insert=25,si_read1=1,si_read2=1,si_read3=1,si_read4=1,si_read5=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    2020-09-15 14:51:51.676: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-1 [13.48.13.140 | 10.0.1.125] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_2si_2queries.yaml ops'(insert=10,si_p_read1=1,si_p_read2=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    2020-09-15 14:52:21.540: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-2 [13.53.109.33 | 10.0.2.224] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_2si_2queries.yaml ops'(insert=10,si_p_read1=1,si_p_read2=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    

    The following schema was generated:

    CREATE KEYSPACE mview WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true;
    
    CREATE TABLE mview.users (
        userid bigint PRIMARY KEY,
        address text,
        email text,
        first_name text,
        initials int,
        last_access timeuuid,
        last_name text,
        password text,
        userdata blob
    ) WITH bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW mview.users_by_first_name AS
        SELECT first_name, userid, email
        FROM mview.users
        WHERE first_name IS NOT null AND userid IS NOT null
        PRIMARY KEY (first_name, userid)
        WITH CLUSTERING ORDER BY (userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW mview.users_by_initials AS
        SELECT initials, userid
        FROM mview.users
        WHERE initials IS NOT null AND userid IS NOT null
        PRIMARY KEY (initials, userid)
        WITH CLUSTERING ORDER BY (userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW mview.users_by_email AS
        SELECT email, userid
        FROM mview.users
        WHERE email IS NOT null AND userid IS NOT null
        PRIMARY KEY (email, userid)
        WITH CLUSTERING ORDER BY (userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW mview.users_by_password AS
        SELECT password, userid
        FROM mview.users
        WHERE password IS NOT null AND userid IS NOT null
        PRIMARY KEY (password, userid)
        WITH CLUSTERING ORDER BY (userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW mview.users_by_last_name AS
        SELECT last_name, userid, email
        FROM mview.users
        WHERE last_name IS NOT null AND userid IS NOT null
        PRIMARY KEY (last_name, userid)
        WITH CLUSTERING ORDER BY (userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW mview.users_by_address AS
        SELECT address, userid
        FROM mview.users
        WHERE address IS NOT null AND userid IS NOT null
        PRIMARY KEY (address, userid)
        WITH CLUSTERING ORDER BY (userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE KEYSPACE keyspace1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '5'}  AND durable_writes = true;
    
    CREATE TABLE keyspace1.standard1 (
        key blob PRIMARY KEY,
        "C0" blob,
        "C1" blob,
        "C2" blob,
        "C3" blob,
        "C4" blob,
        aqpq3qgcom list<frozen<set<timestamp>>>,
        b69k9r389z list<frozen<map<frozen<map<frozen<map<bigint, timeuuid>>, frozen<map<bigint, bigint>>>>, frozen<set<tinyint>>>>>,
        bdbs5ixqdq map<frozen<map<frozen<set<inet>>, frozen<set<frozen<list<date>>>>>>, frozen<list<frozen<set<decimal>>>>>,
        f4xwkb2zcm set<frozen<map<frozen<set<timestamp>>, frozen<map<timeuuid, inet>>>>>,
        fywh69a04j set<frozen<map<frozen<set<varint>>, frozen<set<frozen<map<frozen<map<smallint, inet>>, varint>>>>>>>,
        hacdvjo18p set<frozen<list<frozen<map<smallint, bigint>>>>>,
        iopuqysiqf list<frozen<map<frozen<set<frozen<list<text>>>>, frozen<map<frozen<set<ascii>>, ascii>>>>>,
        jxu8tsm8v5 set<frozen<map<frozen<list<blob>>, frozen<map<frozen<list<text>>, frozen<map<int, smallint>>>>>>>,
        ki1u5t67nf set<frozen<set<ascii>>>,
        l8pw46826p list<frozen<map<frozen<list<date>>, frozen<list<frozen<map<ascii, double>>>>>>>,
        oj5epbs4pn list<frozen<set<frozen<map<smallint, int>>>>>,
        ortj1um8mc set<frozen<list<frozen<map<float, double>>>>>,
        p8v0kjmfsr list<frozen<list<varint>>>,
        rulnhv7azy set<frozen<set<frozen<set<float>>>>>,
        si5zsclur2 map<frozen<map<frozen<list<bigint>>, boolean>>, frozen<set<frozen<set<float>>>>>,
        v3p7qqv1vn list<frozen<list<bigint>>>,
        wyhqruomlw map<frozen<map<frozen<set<int>>, frozen<set<smallint>>>>, frozen<set<frozen<list<frozen<map<ascii, int>>>>>>>,
        yskieerio3 set<frozen<list<frozen<map<frozen<list<timeuuid>>, frozen<set<decimal>>>>>>>
    ) WITH bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    
    CREATE KEYSPACE sec_index WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true;
    
    CREATE TABLE sec_index.users (
        userid bigint PRIMARY KEY,
        address text,
        email text,
        first_name text,
        initials int,
        last_access timeuuid,
        last_name text,
        password text,
        userdata blob
    ) WITH bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 4678
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    CREATE INDEX users_initials_ind ON sec_index.users (initials);
    CREATE INDEX users_last_name_ind ON sec_index.users (last_name);
    CREATE INDEX users_last_access_ind ON sec_index.users (last_access);
    CREATE INDEX users_first_name_ind ON sec_index.users (first_name);
    CREATE INDEX users_address_ind ON sec_index.users (address);
    
    CREATE MATERIALIZED VIEW sec_index.users_address_ind_index AS
        SELECT address, idx_token, userid
        FROM sec_index.users
        WHERE address IS NOT NULL
        PRIMARY KEY (address, idx_token, userid)
        WITH CLUSTERING ORDER BY (idx_token ASC, userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW sec_index.users_first_name_ind_index AS
        SELECT first_name, idx_token, userid
        FROM sec_index.users
        WHERE first_name IS NOT NULL
        PRIMARY KEY (first_name, idx_token, userid)
        WITH CLUSTERING ORDER BY (idx_token ASC, userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW sec_index.users_initials_ind_index AS
        SELECT initials, idx_token, userid
        FROM sec_index.users
        WHERE initials IS NOT NULL
        PRIMARY KEY (initials, idx_token, userid)
        WITH CLUSTERING ORDER BY (idx_token ASC, userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW sec_index.users_last_access_ind_index AS
        SELECT last_access, idx_token, userid
        FROM sec_index.users
        WHERE last_access IS NOT NULL
        PRIMARY KEY (last_access, idx_token, userid)
        WITH CLUSTERING ORDER BY (idx_token ASC, userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW sec_index.users_last_name_ind_index AS
        SELECT last_name, idx_token, userid
        FROM sec_index.users
        WHERE last_name IS NOT NULL
        PRIMARY KEY (last_name, idx_token, userid)
        WITH CLUSTERING ORDER BY (idx_token ASC, userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    

    No reactor stalls were detected during this.

    All db logs: https://cloudius-jenkins-test.s3.amazonaws.com/ca850009-fb1d-4d43-ac60-0fdbce75cc71/20200916_203618/db-cluster-ca850009.zip

    bug repair-based-operations 
    opened by aleksbykov 132
  • Performance regression of 780% in p99th latency compared to 2.2.0 for 100% read test

    Performance regression of 780% in p99th latency compared to 2.2.0 for 100% read test

    Installation details
    Scylla version (or git commit hash): 2.3.rc0-0.20180722.a77bb1fe3
    Cluster size: 3
    OS (RHEL/CentOS/Ubuntu/AWS AMI): AWS AMI (ami-905252ef)
    Instance type: i3.4xlarge

    test_latency_read results showing 780% regression in p99th latency compared to 2.2.0:

    Version | Op rate total | Latency mean | Latency 99th percentile
    -- | -- | -- | --
    2.2.0 | 39997.0 [2018-07-19 10:26:37] | 1.4 [2018-07-19 10:26:37] | 3.1 [2018-07-19 10:26:37]
    2.3.0 | 37200.0 (6% Regression) | 8.2 (485% Regression) | 27.3 (780% Regression)

    (The 780% figure is the relative increase in p99 latency: (27.3 − 3.1) / 3.1 ≈ 7.8.)

    2.3.0 p99th latency looks abnormal and reaches peaks of ~400ms: screen shot 2018-07-25 at 1 26 42

    The test populates 1TB of data and then starts a c-s read command: cassandra-stress read no-warmup cl=QUORUM duration=50m -schema 'replication(factor=3)' -port jmx=6868 -mode cql3 native -rate 'threads=100 limit=10000/s' -errors ignore -col 'size=FIXED(1024) n=FIXED(1)' -pop 'dist=gauss(1..1000000000,500000000,50000000)' (During the first part of the test we can still see compactions that are leftovers of the write population.)

    Full screenshot: screencapture-34-230-6-17-3000-dashboard-db-scylla-per-server-metrics-nemesis-master-test-latency-2-3-2018-07-25-01_31_03

    bug performance Regression 
    opened by roydahan 127
  • Some shards get stuck in tight loop during repair

    Some shards get stuck in tight loop during repair

    This is Scylla's bug tracker, to be used for reporting bugs only. If you have a question about Scylla, and not a bug, please ask it in our mailing-list at [email protected] or in our slack channel.

    • [x] I have read the disclaimer above, and I am reporting a suspected malfunction in Scylla.

    Installation details
    Scylla version (or git commit hash): 5.0.1
    Cluster size: 5
    OS (RHEL/CentOS/Ubuntu/AWS AMI): Ubuntu 20.04

    Hardware details (for performance issues)
    Platform (physical/VM/cloud instance type/docker): Hetzner
    Hardware: sockets=1 cores=4 hyperthreading=8 memory=64G
    Disks: 2x SSD in RAID1

    A few shards on one of my nodes got stuck in a tight loop while running a repair operation. It has been going on for a day and is not making any progress. All the while, CPU usage on three cores is stuck at 100%: image

    Restarting the node also hangs, until it eventually gets killed by systemd. When the node restarts, the same shards get stuck again shortly after initialization.
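
    A minimal way to see what a spinning shard is actually doing, using standard Linux tooling and assuming shell access to the affected node (the CPU number is illustrative; Scylla pins one shard per core):

    # Sample the core the stuck shard is pinned to
    sudo perf top -C 2
    # Or grab a one-off backtrace of all reactor threads (attaches briefly, then detaches)
    sudo gdb -batch -ex 'thread apply all bt' -p "$(pgrep -o -f /usr/bin/scylla)"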

    I have exported logs for the node that is getting stuck: https://pixeldrain.com/u/rwtFViqp And the repair master: https://pixeldrain.com/u/PjML3Y3F

    Some of my data has become inconsistent after some downtime and I have no way to repair it now. Please help.

    bug User Request 
    opened by Fornax96 117
  • test_latency_mixed_with_nemesis - latency during "steady state" get to 20 ms without heavy stalls

    test_latency_mixed_with_nemesis - latency during "steady state" get to 20 ms without heavy stalls

    Installation details
    Scylla version (or git commit hash): 4.4.dev-0.20210114.32fd38f34 with build-id 0642bb3b142094f1092b0d276f6efa858081fe96
    Cluster size: 3
    OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-012cafbb2dc4f1e4d (eu-west-1)

    Running a mixed workload with the command: cassandra-stress mixed no-warmup cl=QUORUM duration=350m -schema 'replication(factor=3)' -port jmx=6868 -mode cql3 native -rate 'threads=50 throttle=3500/s' -col 'size=FIXED(128) n=FIXED(8)' -pop 'dist=gauss(1..250000000,125000000,12500000)'

    During the steady state, the only stalls detected were:

    2021-01-15T06:40:39+00:00  perf-latency-nemesis-perf-v10-db-node-9420ec57-2 !INFO    | scylla: Reactor stalled for 6 ms on shard 5.
    2021-01-15T06:48:16+00:00  perf-latency-nemesis-perf-v10-db-node-9420ec57-3 !INFO    | scylla: Reactor stalled for 6 ms on shard 5.
    2021-01-15T06:51:27+00:00  perf-latency-nemesis-perf-v10-db-node-9420ec57-2 !INFO    | scylla: Reactor stalled for 6 ms on shard 4.
    2021-01-15T06:58:25+00:00  perf-latency-nemesis-perf-v10-db-node-9420ec57-2 !INFO    | scylla: Reactor stalled for 6 ms on shard 6.
    2021-01-15T06:59:50+00:00  perf-latency-nemesis-perf-v10-db-node-9420ec57-2 !INFO    | scylla: Reactor stalled for 6 ms on shard 7.
    2021-01-15T07:07:13+00:00  perf-latency-nemesis-perf-v10-db-node-9420ec57-2 !INFO    | scylla: Reactor stalled for 6 ms on shard 5.
    

    The values for the steady-state latency are:

    Metric name | Metric value
    ----------------- | ------------------
    "c-s P95" | "5.40"
    "c-s P99" | "19.10"
    "Scylla P99_read - node-3" | "19.20"
    "Scylla P99_write - node-1" | "13.76"
    "Scylla P99_read - node-2" | "23.66"
    "Scylla P99_write - node-2" | "13.79"
    "Scylla P99_read - node-1" | "23.55"
    "Scylla P99_write - node-3" | "1.56"

    there is a live monitor here

    here is a live snapshot (if monitor dies)

    from the monitor, we can see: c-s latency Screenshot from 2021-01-28 15-30-08 (copy)

    c-s_max

    and Scylla latency: read_99th

    write_99th

    Comparing with the original document where we checked these values, we have, for Scylla 4.1:

    Metric name | read value | write value
    ----------------- | ---------------- | ---------------
    Mean | 0.9 ms | 0.4 ms
    P95 | 7.8 ms | 1.4 ms
    P99 | 48.2 ms | 2.5 ms
    Max | 71 ms | 71 ms

    and for Scylla 666.development-0.20200910.02ee0483b:

    Metric name | read value | write value
    ----------------- | ---------------- | ---------------
    Mean | 0.7 ms | 0.3 ms
    P95 | 3.6 ms | 0.9 ms
    P99 | 6 ms | 1.2 ms
    Max | 16.8 ms | 16.8 ms

    All the nodes' logs can be downloaded here

    Even the c-s 95th percentile is too high for a steady-state period: c-s_95th

    bug latency 
    opened by fgelcer 116
  • Permanent read/write fails after "bad_alloc"

    Permanent read/write fails after "bad_alloc"

    This is Scylla's bug tracker, to be used for reporting bugs only. If you have a question about Scylla, and not a bug, please ask it in our mailing-list at [email protected] or in our slack channel.

    • [*] I have read the disclaimer above, and I am reporting a suspected malfunction in Scylla.

    Installation details
    Scylla version (or git commit hash): 3.2.4
    Cluster size: 5+5 (multi DC)
    OS (RHEL/CentOS/Ubuntu/AWS AMI): C7.5

    Platform (physical/VM/cloud instance type/docker): bare metal
    Hardware: sockets=2 x Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz, cores=40, hyperthreading=yes, memory=6x32GB DDR4 2666MHz
    Disks: RAID 10 of 10 HDDs, 14TB each, for data; RAID 1 SSD 1TB for commit logs

    Hi!

    The problem started with errors like "exception during mutation write to 10.161.180.24: std::bad_alloc (std::bad_alloc)" and led to one shard constantly failing a lot of (probably all) write/read operations until scylla-server was manually restarted. I suspect this may be due to having large partitions, so here is what we have on that (we have 2 CFs):

    becca/events histograms
    Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                                  (micros)          (micros)           (bytes)
    50%             2.00             16.00          47988.50               770                29
    75%             2.00             20.00          79061.50              5722               215
    95%             6.00             33.00         185724.05             88148              2759
    98%             8.00             36.00         239365.28            182785              5722
    99%            10.00             46.73         295955.11            263210              8239
    Min             0.00              1.00             20.00                73                 2
    Max            24.00          29492.00        2051039.00         464228842           5839588
    
    becca/events_by_ip histograms
    Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                                  (micros)          (micros)           (bytes)
    50%             0.00             16.00              0.00              6866               179
    75%             0.00             19.75              0.00             29521               770
    95%             0.00             33.00              0.00            315852              8239
    98%             0.00             41.00              0.00            785939             20501
    99%             0.00             48.43              0.00           1629722             42510
    Min             0.00              1.00              0.00                73                 0
    Max             0.00          19498.00              0.00         386857368           4866323
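
    For scale, the Max partition sizes reported above (464228842 and 386857368 bytes) are roughly 440 MB and 370 MB, which is well into large-partition territory and a plausible source of allocation pressure. A minimal, hedged way to re-check this on a node (the keyspace/table names are taken from the histograms above; the journal grep assumes the usual large-partition warning is in effect):

    # Per-table partition-size and cell-count percentiles, like the output shown above
    nodetool cfhistograms becca events
    nodetool cfhistograms becca events_by_ip
    # Scylla typically logs a warning when it writes a partition above the configured threshold
    journalctl -u scylla-server | grep -i 'large partition'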
    

    In any case, even if some big query arrived and failed, I do not quite understand why all subsequent queries kept failing until the node was restarted.

    Logs: https://cloud.mail.ru/public/C3AZ/RxPZyKUV6

    Dashboard (by shard)

    Screenshot 2020-05-09 at 20 11 38 Screenshot 2020-05-09 at 17 41 49

    bug User Request hinted-handoff bad_alloc 
    opened by gibsn 114
  • Cassandra Stress times out: BusyPoolException: no available connection and timed out after 5000 MILLISECONDS / using shard-aware driver, get the 1tb longevity to overload

    Cassandra Stress times out: BusyPoolException: no available connection and timed out after 5000 MILLISECONDS / using shard-aware driver, get the 1tb longevity to overload

    Installation details
    Scylla version (or git commit hash): 4.3.rc2-0.20201126.bc922a743 with build-id 840fd4b3f6304765c03e886269b1c2550bf23e53
    Cluster size: 4
    OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-09f30667ba6e09e9b (eu-west-1)
    Scenario: 1tb-7days

    Half an hour into the stress run, at 15:15, a consistent stream of BusyPoolException errors started coming from three of the four nodes (the driver's per-host request queue, capped at 256, had filled up) and continued throughout the entire remaining run of the stress:

    15:15:22.497 [Thread-641] DEBUG c.d.driver.core.RequestHandler - [1227134168-0] Error querying 10.0.0.5/10.0.0.5:9042 : com.datastax.driver.core.exceptions.BusyPoolException: [10.0.0.5/10.0.0.5:9042] Pool is busy (no available connection and the queue has reached its max size 256)
    ...
    15:32:59.650 [cluster1-nio-worker-21] DEBUG c.d.driver.core.RequestHandler - [540726118-0] Error querying 10.0.0.5/10.0.0.5:9042 : com.datastax.driver.core.exceptions.BusyPoolException: [10.0.0.5/10.0.0.5:9042] Pool is busy (no available connection and timed out after 5000 MILLISECONDS)
    
    15:25:50.717 [Thread-177] DEBUG c.d.driver.core.RequestHandler - [544250492-0] Error querying 10.0.3.37/10.0.3.37:9042 : com.datastax.driver.core.exceptions.BusyPoolException: [10.0.3.37/10.0.3.37:9042] Pool is busy (no available connection and the queue has reached its max size 256)
    ...
    15:32:59.638 [cluster1-nio-worker-29] DEBUG c.d.driver.core.RequestHandler - [640744570-0] Error querying 10.0.1.149/10.0.1.149:9042 : com.datastax.driver.core.exceptions.BusyPoolException: [10.0.1.149/10.0.1.149:9042] Pool is busy (no available connection and timed out after 5000 MILLISECONDS)
    

    At the same time, the stress experienced consistent WriteTimeoutException errors, since writes failed to achieve quorum (with replication factor 3, QUORUM needs floor(3/2) + 1 = 2 acknowledging replicas):

    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 1 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 1 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 1 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 1 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 1 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 0 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 0 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 0 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 0 acknowledged the write)
    
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.1.149/10.0.1.149:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.0.5/10.0.0.5:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.1.149/10.0.1.149:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.0.5/10.0.0.5:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.0.5/10.0.0.5:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.1.149/10.0.1.149:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.0.5/10.0.0.5:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.3.37/10.0.3.37:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.3.37/10.0.3.37:9042] Timed out waiting for server response
    

    At 16:18, the stress starts to experience EMPTY RESULT errors:

    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.3.37:9042-7, inFlight=128, closed=false] Response received on stream 27584 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.3.37:9042-7, inFlight=128, closed=false] Response received on stream 27648 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.3.37:9042-7, inFlight=128, closed=false] Response received on stream 27712 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.0.5:9042-11, inFlight=128, closed=false] Response received on stream 32640 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.0.5:9042-11, inFlight=128, closed=false] Response received on stream 32704 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.0.5:9042-11, inFlight=128, closed=false] Response received on stream 0 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.1.149:9042-4, inFlight=128, closed=false] Response received on stream 16320 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.1.149:9042-4, inFlight=128, closed=false] Response received on stream 16384 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.1.149:9042-4, inFlight=128, closed=false] Response received on stream 16448 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    

    Oddly enough, node #4, 10.0.1.77, does not seem to experience any timeouts. In fact, the only messages I see for it in the stress log during that time period are healthy heartbeat messages:

    14:42:15.661 [cluster1-nio-worker-3] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.1.77:9042-2, inFlight=1, closed=false] Keyspace set to keyspace1
    16:19:32.899 [cluster1-reconnection-0] DEBUG com.datastax.driver.core.Host.STATES - [10.0.1.77/10.0.1.77:9042] preparing to open 1 new connections, total = 15
    16:19:32.901 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] Connection established, initializing transport
    16:19:32.937 [cluster1-nio-worker-17] DEBUG c.d.s.netty.handler.ssl.SslHandler - [id: 0x14eb560e, L:/10.0.1.115:48940 - R:10.0.1.77/10.0.1.77:9042] HANDSHAKEN: TLS_RSA_WITH_AES_128_CBC_SHA
    16:19:41.082 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Host.STATES - [10.0.1.77/10.0.1.77:9042] Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] Transport initialized, connection ready
    16:20:03.838 [cluster1-reconnection-0] DEBUG com.datastax.driver.core.Host.STATES - [Control connection] established to 10.0.1.77/10.0.1.77:9042
    16:20:33.809 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] was inactive for 30 seconds, sending heartbeat
    16:20:41.918 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] heartbeat query succeeded
    16:21:11.926 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] was inactive for 30 seconds, sending heartbeat
    16:21:13.881 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] heartbeat query succeeded
    16:21:43.882 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] was inactive for 30 seconds, sending heartbeat
    16:21:48.369 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] heartbeat query succeeded
    16:22:18.373 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] was inactive for 30 seconds, sending heartbeat
    16:22:22.816 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] heartbeat query succeeded
    

    Screenshot from 2020-12-03 13-44-38

    Screenshot from 2020-12-03 14-04-38

    Judging by the per-instance metrics of both foreground and background writes, node #4 indeed receives fewer writes than any other node. Perhaps this caused the in-flight hint messages of the other nodes to fill up, considering that in the errors above the connections to nodes 1-3 all report inFlight=128. Or perhaps there is an issue with key distribution between the nodes, which caused the other nodes to receive more load than they could handle.
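
    One hedged way to test the key-distribution hypothesis is to compare ownership and load per node with nodetool. A minimal sketch, assuming the stress used its default keyspace1/standard1 schema (adjust the names if the run used a different schema):

    # Effective token ownership and load per node for the stress keyspace (run on any node).
    nodetool status keyspace1
    # Per-table load statistics (older nodetool versions call this "cfstats").
    nodetool tablestats keyspace1.standard1
    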

    The failed stress command:

    cassandra-stress write cl=QUORUM n=1100200300 -schema 'replication(factor=3) compaction(strategy=LeveledCompactionStrategy)' -port jmx=6868 -mode cql3 native -rate threads=1000 -col 'size=FIXED(200) n=FIXED(5)' -pop seq=1..1100200300
    

    Other prepare stresses for this run:

    cassandra-stress write cl=QUORUM n=50000000 -schema 'replication(factor=3) compression=LZ4Compressor compaction(strategy=SizeTieredCompactionStrategy)' -port jmx=6868 -mode cql3 native compression=lz4 -rate threads=50 -pop seq=1..50000000 -log interval=5
    cassandra-stress write cl=QUORUM n=50000000 -schema 'replication(factor=3) compression=SnappyCompressor compaction(strategy=SizeTieredCompactionStrategy)' -port jmx=6868 -mode cql3 native compression=snappy -rate threads=50 -pop seq=1..50000000 -log interval=5
    cassandra-stress write cl=QUORUM n=50000000 -schema 'replication(factor=3) compression=DeflateCompressor compaction(strategy=SizeTieredCompactionStrategy)' -port jmx=6868 -mode cql3 native compression=none -rate threads=50 -pop seq=1..50000000 -log interval=5
    cassandra-stress write cl=QUORUM n=50000000 -schema 'replication(factor=3) compression=ZstdCompressor compaction(strategy=SizeTieredCompactionStrategy)' -port jmx=6868 -mode cql3 native compression=none -rate threads=50 -pop seq=1..50000000 -log interval=5
    

    (Each of them runs once, spread across 2 loaders)

    Node list:

    longevity-tls-1tb-7d-4-3-db-node-66a319cd-1 [34.243.3.190 | 10.0.1.149]
    longevity-tls-1tb-7d-4-3-db-node-66a319cd-2 [54.246.50.198 | 10.0.0.5] 
    longevity-tls-1tb-7d-4-3-db-node-66a319cd-3 [54.247.54.152 | 10.0.3.37]
    longevity-tls-1tb-7d-4-3-db-node-66a319cd-4 [52.211.7.163 | 10.0.1.77] 
    

    Logs:

    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    |                                                                                            Log links for testrun with test id 66a319cd-223d-450b-8f0f-2bb423d39693                                                                                            |
    +-----------------+-------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | Date            | Log type    | Link                                                                                                                                                                                                                          |
    +-----------------+-------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 20190101_010101 | prometheus  | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/prometheus_snapshot_20201202_164129.tar.gz                                                                                                |
    | 20201202_163157 | grafana     | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_163157/grafana-screenshot-overview-20201202_163158-longevity-tls-1tb-7d-4-3-monitor-node-66a319cd-1.png                          |
    | 20201202_163157 | grafana     | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_163157/grafana-screenshot-scylla-per-server-metrics-nemesis-20201202_163545-longevity-tls-1tb-7d-4-3-monitor-node-66a319cd-1.png |
    | 20201202_164145 | grafana     | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_164145/grafana-screenshot-overview-20201202_164145-longevity-tls-1tb-7d-4-3-monitor-node-66a319cd-1.png                          |
    | 20201202_164145 | grafana     | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_164145/grafana-screenshot-scylla-per-server-metrics-nemesis-20201202_164500-longevity-tls-1tb-7d-4-3-monitor-node-66a319cd-1.png |
    | 20201202_165046 | db-cluster  | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_165046/db-cluster-66a319cd.zip                                                                                                   |
    | 20201202_165046 | loader-set  | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_165046/loader-set-66a319cd.zip                                                                                                   |
    | 20201202_165046 | monitor-set | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_165046/monitor-set-66a319cd.zip                                                                                                  |
    | 20201202_165046 | sct-runner  | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_165046/sct-runner-66a319cd.zip                                                                                                   |
    +-----------------+-------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    

    To start the monitor using hydra:

    hydra investigate show-monitor 66a319cd-223d-450b-8f0f-2bb423d39693
    
    longevity overload 
    opened by ShlomiBalalis 109
  • some non-prepared statements can leak memory (with set/map/tuple/udt literals)

    some non-prepared statements can leak memory (with set/map/tuple/udt literals)

    This is Scylla's bug tracker, to be used for reporting bugs only. If you have a question about Scylla, and not a bug, please ask it in our mailing-list at [email protected] or in our slack channel.

    • [*] I have read the disclaimer above, and I am reporting a suspected malfunction in Scylla.

    Installation details
    Scylla version: 4.0.4
    Cluster size: 10 nodes, 4 shards per node
    OS: Ubuntu

    After running fine for a few days, nodes consistently start hitting 'bad_alloc' errors, even though we do not have many data files (~1850 Data files) and our data size (400G per node) is not that large compared to the memory available on the node (90G for 4 shards, so about 22G per shard).

    Aug 11 04:31:13 fr-eqx-scylla-04 scylla[10177]: WARN 2020-08-11 04:31:13,007 [shard 0] storage_proxy - Failed to apply mutation from 192.168.96.47#0: std::bad_alloc (std::bad_alloc)

    Our non-LSA memory keeps growing, and once it reaches a certain level the shard just starts throwing bad_alloc:

    [Graph: non_lsa memory per shard on scylla-04, 2020-08-11]
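
    To follow the non-LSA growth outside Grafana, the node's Prometheus endpoint can be scraped directly. A minimal sketch, assuming the default metrics port 9180 and that the memory/LSA gauge names have not changed in this version (both are assumptions):

    # Dump memory/LSA gauges from one node; metric names differ between Scylla
    # versions, so grep broadly and drop the HELP/TYPE comment lines.
    curl -s http://fr-eqx-scylla-04:9180/metrics | grep -Ei 'lsa|memory' | grep -v '^#'
    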

    bug User Request bad_alloc 
    opened by withings-sas 104
  • configure: don't reduce parsers' optimization level to 1 in release

    configure: don't reduce parsers' optimization level to 1 in release

    The line modified in this patch was supposed to increase the parsers' optimization level to 1 in debug mode, because they were too slow otherwise. But as a side effect, it also reduced the optimization level to 1 in release mode. This is not a problem for the CQL frontend, because statement preparation is not performance-sensitive, but it is a serious performance problem for Alternator, where parsing lies on the hot path.

    Fix this by applying -O1 only in debug modes.
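
    A quick, hedged way to verify the change is to look at every place configure.py mentions -O1 and confirm the override is now scoped to the debug mode only (this assumes the flag appears literally as "-O1" in the file):

    # Show each mention of -O1 in configure.py with surrounding context; after the fix,
    # it should appear only under the debug mode settings.
    grep -n -C 3 -e '-O1' configure.py
    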

    opened by michoecho 3
  • repair: finish repair immediately on local keyspaces

    repair: finish repair immediately on local keyspaces

    Keyspaces with the local replication strategy do not need to be repaired. Thus, for such keyspaces, repair_service::do_repair_start now returns 0 (instead of a repair sequence number) immediately.

    opened by Deexie 3
  • Timeout point sent in `forward_request` verb comes from `seastar::lowres_clock`

    Timeout point sent in `forward_request` verb comes from `seastar::lowres_clock`

    In the parallelized aggregation layer, query::forward_request (which is sent to remote nodes) carries timeout information as a lowres_clock::time_point taken from the local seastar::lowres_clock. That time_point is later used on the remote nodes to compute the timeout deadline. This is wrong, because lowres_clock time points are only meaningful on the node that produced them, and it may lead to delayed or premature timeouts.

    bug P2 
    opened by havaker 1
  • test: test_topology: make test_nodes_with_different_smp less hacky

    test: test_topology: make test_nodes_with_different_smp less hacky

    The test would use a trick to start a separate Scylla cluster from the one provided originally by the test framework. This is not supported by the test framework and may cause unexpected problems.

    Change the test to perform regular node operations. Instead of starting a fresh cluster of 3 nodes, we join the first of these nodes to the original framework-provided cluster, then decommission the original nodes, then bootstrap the other 2 fresh nodes.

    Also add some logging to the test.

    Refs: #12438, #12442

    opened by kbr-scylla 4
  • doc: replace Scylla with ScyllaDB on the menu tree and major links

    doc: replace Scylla with ScyllaDB on the menu tree and major links

    This PR replaces "Scylla" with "ScyllaDB" in the most exposed places, including the left menu panel (this involves updating the page titles) and links on the index pages. This PR partly satisfies the requirements of https://github.com/scylladb/scylla-docs/issues/3962.

    If a page or section title is updated, the underline markup must also be updated (it must be at least as long as the title text). Examples:

    OK:

    ScyllaDB Hinted Handoff
    =======================
    

    OK:

    ScyllaDB Hinted Handoff
    ===========================
    

    WRONG:

    ScyllaDB Hinted Handoff
    ==============
    
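
    A small, hedged sanity check for this rule (the docs/ path and the assumption that titles are underlined with '=' are mine, not part of the PR):

    # Rough check: report RST '=' underlines shorter than the preceding title line.
    # Other underline characters (-, ~, etc.) would need the same treatment.
    find docs -name '*.rst' -exec awk '
        prev != "" && $0 ~ /^=+$/ && length($0) < length(prev) {
            printf "%s:%d: underline shorter than title \"%s\"\n", FILENAME, FNR, prev
        }
        { prev = $0 }
    ' {} +
    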
    Documentation 
    opened by annastuchlik 0
  • tools: toolchain: drop s390x from prepare script architecture list

    tools: toolchain: drop s390x from prepare script architecture list

    It's been a long while since we built ScyllaDB for s390x, and in fact the last time I checked, the build was broken: the ragel parser generator was producing bad source files for the HTTP parser. So just drop it from the list.

    I kept s390x in the architecture mapping table since it's still valid.

    opened by avikivity 1
Owner
ScyllaDB, The Real-Time Big Data Database
Apache Druid: a high performance real-time analytics database.

Website | Documentation | Developer Mailing List | User Mailing List | Slack | Twitter | Download Apache Druid Druid is a high performance real-time a

The Apache Software Foundation 12.3k Jan 2, 2023
Apache HBase

Apache HBase [1] is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable: A Distributed Storage System for Str

The Apache Software Foundation 4.7k Jan 7, 2023
Infinispan is an open source data grid platform and highly scalable NoSQL cloud data store.

The Infinispan project Infinispan is an open source (under the Apache License, v2.0) data grid platform. For more information on Infinispan, including

Infinispan 1k Dec 31, 2022
Flink Table Store is a unified streaming and batch store for building dynamic tables on Apache Flink

Flink Table Store is a unified streaming and batch store for building dynamic tables on Apache Flink

The Apache Software Foundation 366 Jan 1, 2023
Mirror of Apache Cassandra

Apache Cassandra Apache Cassandra is a highly-scalable partitioned row store. Rows are organized into tables with a required primary key. Partitioning

The Apache Software Foundation 7.7k Jan 1, 2023
Evgeniy Khyst 54 Dec 28, 2022
Time Series Metrics Engine based on Cassandra

Hawkular Metrics, a storage engine for metric data About Hawkular Metrics is the metric data storage engine part of Hawkular community. It relies on A

Hawkular 230 Dec 9, 2022
Spring MSA api gateway & service discovery with consul & Jaeger & Cassandra

Spring-Cloud-MSA setup: Prepare a Cassandra server and create the keyspace and tables with the table.sql file. Based on Consul version 1.11.1; download the 1.11.1 version for your operating system from https://www.consul.io/downloads

INSUNG CHOI 2 Nov 22, 2022
Firestorm is a Remote Shuffle Service, and provides the capability for Apache Spark applications to store shuffle data on remote servers

What is Firestorm Firestorm is a Remote Shuffle Service, and provides the capability for Apache Spark applications to store shuffle data on remote ser

Tencent 246 Nov 29, 2022
A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.

Apache Gobblin Apache Gobblin is a highly scalable data management solution for structured and byte-oriented data in heterogeneous data ecosystems. Ca

The Apache Software Foundation 2.1k Jan 4, 2023
An API Library that provides the functionality to access, manage and store device topologies found in JSON files using Java and Maven Framework

Topology API. About: An API library which provides the functionality to access, manage and store device topologies. Description: Read a topology fr

Abdelrahman Hamdy 2 Aug 4, 2022
CrateDB is a distributed SQL database that makes it simple to store and analyze massive amounts of machine data in real-time.

About CrateDB is a distributed SQL database that makes it simple to store and analyze massive amounts of machine data in real-time. CrateDB offers the

Crate.io 3.6k Jan 2, 2023
A distributed in-memory data store for the cloud

EVCache EVCache is a memcached & spymemcached based caching solution that is mainly used for AWS EC2 infrastructure for caching frequently used data.

Netflix, Inc. 1.9k Jan 2, 2023
New-fangled Timeseries Data Store

Newts Newts is a time-series data store based on Apache Cassandra. Features High throughput Newts is built upon Apache Cassandra, a write-optimized, f

OpenNMS 190 Oct 3, 2022
Cosmic Ink is a transcript application which was built with the help of Symbl AI and At Sign platform for back-end to store our data and authenticate

Cosmic-Ink Cosmic Ink is a transcript application which was built with the help of Symbl AI and At Sign platform for back-end to store our data and au

Venu Sai Madisetti 4 Dec 1, 2022