Tracking new CVPR·ICCV·ECCV·WACV·BMVC papers daily

40 papers total · 9 with code available · Source: arXiv cs.CV

ECCV 2026

JanusMesh: Fast and Zero-Shot 3D Visual Illusion Generation via Cross-Space Denoising

Creating 3D visual illusions, a single 3D mesh that reveals entirely different semantics from various viewing angles, is a fascinating but tough challenge. Existing optimization-based methods are slow and can produce ove…

June 18, 2026

More papers

TimeProVe: Propose, then Verify for Efficient Long Video Temporal Reasoning in Activities of Daily Living

June 18, 2026

Long Video Question Answering (LVQA) requires identifying sparse, query-relevant evidence within hours-long untrimmed videos. Existing approaches either process…

cs.CV

UNIEGO: Proxies as Mediators for Unified Egocentric Video Representation Learning

June 18, 2026

Egocentric video understanding is inherently limited by the narrow perspective of wearable cameras: a single viewpoint, a single modality, a single model cannot…

cs.CV · cs.LG

Thinking in Boxes: 3D Editing in Real Images Made Easy

June 18, 2026

Text and 2D-conditioning interfaces provide weak, ambiguous control over spatial transformations in image editing -- particularly under large object motions and…

cs.CV

The Token Is a Group Element: On Lie-Algebra Attention over Matrix Lie Groups

June 18, 2026

We place the attention token on the group: a token is an element $g_i$ of a matrix Lie group $G$ -- a bare transformation, with no feature payload and no extern…

cs.LG · cs.CV · cs.GR · cs.RO · math.DG

Current World Models Lack a Persistent State Core

June 18, 2026

World models are increasingly regarded as a decisive step toward artificial general intelligence, yet modeling the physical world demands more than rendering co…

cs.CV

SSD: Spatially Speculative Decoding Accelerates Autoregressive Image Generation

June 18, 2026

Autoregressive models excel in visual generation by treating images as 1D sequences of discrete tokens, mirroring language modeling. However, this flattening di…

cs.CV

CalTennis: Large Multi-View Tennis Video Dataset and Benchmark of Monocular-to-3D Pose Estimation

June 18, 2026

The Caltech Tennis Dataset (CalTennis) is a large-scale video benchmark for evaluating monocular-to-3D pose estimation in the wild. CalTennis comprises over 11 …

cs.CV

The FID Lottery: Quantifying Hidden Randomness in Generative-Model Evaluation

June 18, 2026

The Fréchet Inception Distance (FID) is the de facto arbiter of image generation, yet most papers report just a single number from a single trained model using …

cs.CV

VisDom: Sparse Novel View Synthesis with Visible Domain Constraint

June 18, 2026

Sparse novel view synthesis (NVS) remains challenging due to the ambiguity of recovering 3D geometry from few input views. While NeRF- and Gaussian Splatting (G…

cs.CV

StylisticBias: A Few Human Visual Cues Drive Most Social Biases in MLLMs

June 18, 2026

Multimodal large language models (MLLMs) are increasingly deployed in personally and societally consequential settings, yet the visual cues that shape how these…

ICML 2026 · 💻 Code available · cs.CL · cs.CV

SARLO-80: Worldwide Slant SAR Language Optic Dataset 80cm

June 18, 2026

Multimodal foundation models have advanced rapidly thanks to large optical benchmarks, but comparable resources for synthetic aperture radar (SAR) remain limite…

cs.CV · cs.AI · cs.DB

HumanScale: Egocentric Human Video Can Outperform Real-Robot Data for Embodied Pretraining

June 18, 2026

Embodied foundation models are expected to benefit from data scaling like large language models, but face a much tighter data bottleneck. Teleoperated real-robo…

💻 Code available · cs.CV

S-Agent: Spatial Tool-Use Elicits Reasoning for Spatial Intelligence

June 18, 2026

Real-world spatial intelligence requires reasoning over a continuous and evolving 3D world, yet existing VLMs and tool-augmented agents largely remain tied to s…

cs.CV

FreeStyle: Free Control of Style-Content Dual-Reference Generation from Community LoRA Mining

June 18, 2026

Style-content dual-reference generation aims to synthesize an image that preserves the structure and semantics of a content reference while adopting the style o…

💻 Code available · cs.CV · cs.AI

Fast Human Attention Prediction for Fixation-guided Active Perception in Autonomous Navigation

June 18, 2026

Human visual attention relies on structured scanpaths to efficiently process scenes, yet instilling this behavior into robot autonomy is in its infancy and hind…

cs.RO · cs.CV

How Fragile Are Training-Free AI-Generated Image Detectors? A Controlled Audit of Score Direction, Preprocessing, and Compression

June 18, 2026

Training-free detectors of AI-generated images promise generator-agnostic deployment without classifier training, yet their reported numbers are rarely compared…

cs.CV

Scalable Training of Spatially Grounded 2D Vision-Language Models for Radiology

June 18, 2026

We study how to train visually grounded vision-language models (VLMs) for radiology without manual spatial annotations. We introduce RefRad2D, a large-scale bil…

cs.CV · cs.CL · cs.LG

PCFootprint: A Large-Scale Dataset and Benchmark for Vectorized Building Footprint Extraction from Aerial LiDAR Point Clouds

June 18, 2026

Building footprint extraction is a fundamental task in photogrammetry, remote sensing, and computer vision. Recent image-based methods have achieved remarkable …

cs.CV

InfantFace: Detecting infant faces in neonatal clinical environments

June 18, 2026

Reliable localisation of the neonatal face is the first step for several video-camera based non-contact assessments such as pain and distress related facial exp…

cs.CV

Spectral Query-Key Product Weight Steering for Training-Free VLM Hallucination Mitigation

June 18, 2026

Vision-language models (VLMs) often generate fluent but visually unsupported descriptions, especially by mentioning objects absent from the image. We propose QK…

cs.CV

On the Redundancy of Timestep Embeddings in Diffusion Models

June 18, 2026

Diffusion models rely heavily on explicit timestep embeddings to modulate the denoising process across various noise scales. In this work, we challenge the nece…

cs.LG · cs.CV

FlowBender: Feedback-Aware Training for Self-Correcting Conditional Flows

June 18, 2026

Conditional diffusion and flow models routinely fail to satisfy the very constraints that define their task. For instance, a depth-conditioned model often produ…

cs.CV

Geometry-Aware Superpixel Graph Transformer with Metadata for Skin Lesion Classification

June 18, 2026

Automated skin cancer classification from dermoscopic images remains challenging due to heterogeneous lesion structure, strong intra-class variability, and subt…

cs.CV

Reliability-Aware Prototype Calibration for Frozen Pose-Flow Video Anomaly Detection

June 18, 2026

Pose-flow video anomaly detectors are attractive for one-class surveillance because they provide likelihood-based rankings for tracked skeleton windows. However…

💻 Code available · cs.CV

Through the PRISM: Preference Representation in Intermediate States of Video Diffusion Models

June 18, 2026

Evaluating video generation with clean, pixel-based reward models disconnects evaluation from the noisy diffusion process and incurs massive VAE decoding costs.…

cs.CV

GEN-Guard: Correcting Generalization Failures for Deployable Federated Surgical AI

June 18, 2026

Federated Learning (FL) in surgical video AI enables collaborative model training without sharing sensitive data. However, standard evaluation practices - selec…

cs.CV

CUPID: Reconstructing UV Texture Maps for Interpretable Person-of-Interest Deepfake Detection

June 18, 2026

Deepfakes targeting a high-profile individual, known as Person-of-Interest (POI), are a threat to modern democracies and societies. Current POI deepfake detecti…

💻 Code available · cs.CV

CMDS-AD: Cross-Modal Dual-Stream Decoupling for Few-Shot Anomaly Detection

June 18, 2026

Few-shot anomaly detection remains challenging due to limited training data. Multi-modal anomaly detection (MAD) offers a viable solution, leveraging 3D geometr…

ECCV 2026 · cs.CV

Integrating national forest inventory, airborne lidar, and satellite imagery for wall-to-wall mapping of forest structure with computer vision

June 18, 2026

Remote sensing is increasingly relied upon to deliver actionable science for forest and wildfire risk management across large landscapes. Wall-to-wall, annually…

cs.LG · cs.CV

U$^2$Mamba: A Two-level Nested U-structure Mamba for Salient Object Detection

June 18, 2026

Mamba-based models have emerged as a promising alternative for salient object detection (SOD), offering significant advantages in modeling long sequences. Howev…

💻 Code available · cs.CV

Efficiently Linking Real Scenes with Synthetic Data Generation for AI-based Cognitive Robotics and Computer Vision Applications

June 18, 2026

AI vision models are a driving factor for the potential use case scenarios of cognitive robotics within industry and household applications. A large arra…

cs.RO · cs.CV

Single-Stage Hierarchical Rectification for Weakly Supervised Histopathology Segmentation

June 18, 2026

Existing weakly supervised semantic segmentation (WSSS) methods in computational pathology rely on a multi-stage paradigm: class activation map (CAM) generation…

💻 Code available · cs.CV

SPOT-E: Test-Time Entropy Shaping with Visual Spotlights for Frozen VLMs

June 18, 2026

Vision-language models (VLMs) often underperform on evidence-intensive tasks because decisive visual evidence is small, localized, and easy to overlook, leadin…

💻 Code available · cs.CV · cs.AI

BAFIS: Dataset + Framework to assess occupational Bias and Human Preference in modern Text-to-image Models

June 18, 2026

Generative artificial intelligence has the potential to improve productivity and transform the production of creative content. However, existing research indica…

WACV 2026 · cs.CV

DeepForestVisionV2: Ecology-Driven Taxonomy Expansion for Camera-Trap Monitoring in African Tropical Forests

June 18, 2026

Camera-trap monitoring in African tropical forests increasingly extends beyond closed-canopy interiors to riverbanks, clearings, and park edges. Among available…

cs.CV · q-bio.QM

Evaluation of Image Matching for Art Skills Assessment

June 18, 2026

While some individuals possess a natural talent for drawing, mastering this skill requires dedicated training and practice. Determining one's skill in the art o…

cs.CV

Distill Once, Adapt Life-Long: Exploring Dataset Distillation for Continual Test-Time Adaptation

June 18, 2026

Continual Test-Time Adaptation (CTTA) aims to maintain model performance under evolving target domains by adapting online without labeled data. However, practic…

ECCV 2026 · 💻 Code available · cs.CV

HilDA: Hierarchical Distillation with Diffusion for Advancing Self-Supervised LiDAR Pre-training

June 18, 2026

Leveraging Vision Foundation Models (VFMs) for camera-to-LiDAR knowledge distillation offers a promising solution to the scarcity of annotated data needed to re…

ECCV 2026 · cs.CV · cs.AI · cs.RO

Evaluating and Enhancing Negation Comprehension in Remote Sensing MLLMs

June 18, 2026

Multimodal Large Language Models (MLLMs) have demonstrated remarkable success in various Remote Sensing (RS) tasks. However, their ability to comprehend negatio…

ECCV 2026 · cs.CV · cs.AI