Norconex/crawlers

Norconex Crawlers (or spiders) are flexible web and filesystem crawlers that collect, parse, and manipulate data from the web or a filesystem and send it to data repositories such as search engines.

GitHub repository with 202 stars and 70 forks.

Language: Java

Topics: search-engine, web-crawler, java, collector-http, flexible, crawler, crawlers, filesystem-crawler, collector-fs


Latest metric snapshot

2026-06-05: 202 stars and 70 forks.

Similar repositories

  1. elastic/elasticsearch

    Free and Open Source, Distributed, RESTful Search Engine

    GitHub repository with 76,861 stars and 25,967 forks.

    Trending score: 1.38; stars gained: +27; forks gained: +6.

    Language: Java

    Topics: elasticsearch, java, search-engine

  2. opensearch-project/OpenSearch

    🔎 Open source distributed and RESTful search engine.

    GitHub repository with 13,078 stars and 2,602 forks.

    Trending score: 1.07; stars gained: +12; forks gained: +4.

    Language: Java

    Topics: analytics, apache2, foss, java, search, search-engine

  3. apache/lucene

    Apache Lucene open-source search software

    GitHub repository with 3,442 stars and 1,364 forks.

    Trending score: 0.29; stars gained: -1; forks gained: +0.

    Language: Java

    Topics: backend, information-retrieval, java, lucene, nosql, search

  4. codelibs/fess

    Fess is a very powerful and easily deployable Enterprise Search Server.

    GitHub repository with 1,111 stars and 174 forks.

    Trending score: 0.09; stars gained: +0; forks gained: +0.

    Language: Java

    Topics: java, elasticsearch, full-text-search, lucene, crawler, enterprise-search

  5. apache/solr

    Apache Solr open-source search software

    GitHub repository with 1,622 stars and 829 forks.

    Trending score: 0.05; stars gained: +0; forks gained: +0.

    Language: Java

    Topics: lucene, solr, search, nosql, java, backend

Trending in Java

  1. fish2018/webhtv

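    WebHomeTV is a secondary development based on FongMi that enhances the WebHome custom home page, the App Native SDK, cloud-drive link detection, and the Nostr recommended home page. The project's core goal is to turn a CSP site's home page into a genuinely developable web application: developers can customize the home page with HTML/CSS/JavaScript, then use the Native capabilities exposed by the App to handle search, playback, cross-origin requests, resource proxying, recently watched items, cloud-drive detection, and state synchronization.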

    GitHub repository with 386 stars and 108 forks.

    Trending score: 3.29; stars gained: +83; forks gained: +16.

    Language: Java

  2. juanjuandog/FinSight-AI

    AI equity research agent with resilient workflows, Redis Lua single-flight, pgvector RAG, versioned reports, evidence tracing, and RAG evaluation.

    GitHub repository with 1,003 stars and 58 forks.

    Trending score: 3.24; stars gained: +77; forks gained: +1.

    Language: Java

    Topics: ai-agent, financial-research, llm-evaluation, pgvector, postgresql, rabbitmq

  3. apache/kafka

    Apache Kafka - A distributed event streaming platform

    GitHub repository with 32,714 stars and 15,251 forks.

    Trending score: 2.36; stars gained: +10; forks gained: +7.

    Language: Java

    Topics: java, kafka, scala, streaming

  4. spring-projects/spring-ai

    An Application Framework for AI Engineering

    GitHub repository with 8,871 stars and 2,616 forks.

    Trending score: 2.31; stars gained: +22; forks gained: +5.

    Language: Java

    Topics: artificial-intelligence, hacktoberfest, java, spring-ai

  5. JetBrains/intellij-community

    IntelliJ IDEA & IntelliJ Platform

    GitHub repository with 20,185 stars and 5,925 forks.

    Trending score: 2.29; stars gained: +16; forks gained: +4.

    Language: Java

    Topics: code-editor, ide, intellij, intellij-community, intellij-platform

  6. apache/beam

    Apache Beam is a unified programming model for Batch and Streaming data processing.

    GitHub repository with 8,605 stars and 4,576 forks.

    Trending score: 2.18; stars gained: +5; forks gained: +4.

    Language: Java

    Topics: batch, beam, big-data, golang, java, python

Trending topic: search-engine

  1. quickwit-oss/tantivy

    Tantivy is a full-text search engine library inspired by Apache Lucene and written in Rust

    GitHub repository with 15,337 stars and 919 forks.

    Trending score: 1.49; stars gained: +24; forks gained: +1.

    Language: Rust

    Topics: search-engine, rust

  2. lancedb/lancedb

    Developer-friendly OSS embedded retrieval library for multimodal AI. Search More; Manage Less.

    GitHub repository with 10,507 stars and 889 forks.

    Trending score: 1.40; stars gained: +16; forks gained: +0.

    Language: HTML

    Topics: approximate-nearest-neighbor-search, image-search, nearest-neighbor-search, recommender-system, search-engine, semantic-search

  3. elastic/elasticsearch

    Free and Open Source, Distributed, RESTful Search Engine

    GitHub repository with 76,861 stars and 25,967 forks.

    Trending score: 1.38; stars gained: +27; forks gained: +6.

    Language: Java

    Topics: elasticsearch, java, search-engine

  4. weaviate/weaviate

    Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a cloud-native database.

    GitHub repository with 16,277 stars and 1,306 forks.

    Trending score: 1.26; stars gained: +9; forks gained: +0.

    Language: Go

    Topics: search-engine, semantic-search, semantic-search-engine, vector-search, vector-search-engine, vector-database

  5. opensearch-project/OpenSearch

    🔎 Open source distributed and RESTful search engine.

    GitHub repository with 13,078 stars and 2,602 forks.

    Trending score: 1.07; stars gained: +12; forks gained: +4.

    Language: Java

    Topics: analytics, apache2, foss, java, search, search-engine

  6. karust/openserp

    Open-source SERP API for AI, SEO & automation - Google, Yandex, Baidu, Bing, DuckDuckGo, Ecosia 🎉

    GitHub repository with 745 stars and 102 forks.

    Trending score: 0.88; stars gained: +7; forks gained: +1.

    Language: Go

    Topics: ai, baidu, bing, duckduckgo, ecosia, google